Question
“I have some empty categories in my data. For example, none of my respondents filled out "politician" as their job. Therefore, "politician" is completely absent from crosstabs. How can I include it with a zero frequency?”
Solution
The basic solution here is to add one or more fake cases to the data that do have the absent values. Giving them case weights close to zero includes them in crosstabs with zero frequencies. Although this solution requires a couple of steps, all of them are pretty basic.
Run the syntax below in order to set up the data and run the initial crosstabs. Note that none of our respondents works as a politician (job = 2). Therefore, "politician" doesn't show up in the crosstab. The same goes for "law" as education (education = 3).
SPSS Syntax Example 1
*1. Create data.
data list free/job education gender.
begin data
1 1 0 1 1 0 1 2 0 1 2 0 3 1 1 3 1 1 3 2 1 3 2 1
end data.
dataset name data.
*2. Apply value labels.
value labels job 1 'Statistician' 2 'Politician' 3 'Teacher'.
value labels education 1 'Economics' 2 'Social Sciences' 3 'Law'.
value labels gender 0 'Female' 1 'Male'.
*3. Show value labels in output and make crosstabs.
set tnumbers labels.
crosstabs gender by job education.
Creating Fake Cases
- We'll now create a second dataset holding fake cases. These cases have the absent categories (job = 2 and education = 3) as data values. It doesn't matter what gender they have as long as it's not a missing value. For this example we need only one fake case.
- Second, we COMPUTE a variable called caseweight and set it to a very small value. Note that 1E-5 is shorthand for 1 * 10**-5; it's an easy way of writing 0.00001. Later on, when we WEIGHT our cases, these small weights will generate the desired zeroes in our crosstabs.
- Finally, we'll merge our fake case into our actual data with ADD FILES and close all but the merged dataset. Don't worry about the original data; we'll easily restore it in a minute.
SPSS Syntax Example 2
*1. Create fake cases.
data list free/job education gender.
begin data
2 3 0
end data.
dataset name fakecases.
*2. Assign very small case weights.
compute caseweight = 1E-5.
*3. Merge fake cases into data.
add files file = data/file = fakecases.
exe.
*4. Close original datasets.
dataset close all.
dataset name alldata.
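Before weighting, it may be worth checking that the merge did what we expect. The lines below are a minimal sketch, not part of the original syntax. They assume the merged dataset from Example 2 is active. Because a variable created with COMPUTE gets a two-decimal format by default, the tiny weight would display as .00, so the sketch widens the format first.

```
*Optional check (not in the original example): inspect the merged data.
*Show enough decimals to see the tiny weight.
formats caseweight (f8.5).
*List all cases with their weights.
list variables = job education gender caseweight.
```

We'd expect the eight real cases to show a system missing value on caseweight and the single fake case (job = 2, education = 3) to show .00001.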
Creating the Desired Tables
- After merging the fake case(s) into the original data, real cases have system missing values on caseweight. A short IF command using the MISSING function quickly fixes this.
- After setting the weight in effect, each real case counts as a single case since they all have caseweight = 1. Each fake case counts as almost zero cases.
- Now the same CROSSTABS command we used previously includes the empty categories.
- As promised, we'll finally delete the fake case by using SELECT IF. Fake cases are readily recognized by their small case weights.
- After doing so, the weight variable is no longer needed either. Deleting it restores the original data.
SPSS Syntax Example 3
*1. Case weight = 1 for real cases.
if missing(caseweight) caseweight = 1.
*2. Switch on case weights.
weight by caseweight.
*3. Generate desired crosstabs.
crosstabs gender by education job.
*4. When done, delete fake cases.
select if caseweight = 1.
exe.
*5. Deleting caseweight restores original data.
delete variables caseweight.
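If you'd rather not rely on what SPSS does when the active weight variable is deleted, a more cautious variant of steps 4 and 5 switches weighting off explicitly first. This is a sketch under that assumption, not the original author's syntax. Run it instead of steps 4 and 5 above, not after them, since caseweight must still exist.

```
*Cautious variant of steps 4 and 5 (an assumption, not the original syntax).
*Switch weighting off before removing anything.
weight off.
*Keep only real cases, which all have a weight of exactly 1.
select if caseweight = 1.
execute.
*Remove the helper variable.
delete variables caseweight.
*Report the current weighting status as a final check.
show weight.
```

After this, the data should again hold the eight original cases and the three original variables.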