Many easy options have been proposed for combining the values of categorical variables in SPSS. However, the real information is usually in the value labels instead of the values. This tutorial proposes a simple trick for combining categorical variables and automatically applying correct value labels to the result.
SPSS Combine Categorical Variables Example
You may follow along by downloading and opening hospital.sav. Now say we'd like to combine “doctor_rating” and “nurse_rating” (near the end of the file). The result is shown in the screenshot below. Note that all variables are numeric with proper value labels applied to them.
SPSS Combine Categorical Variables Syntax
We first present the syntax that does the trick. Next, we'll point out how to easily use it on other data files.
*1. Declare new tmp string variable.
string tmp(a1000).
*2. Combine values and value labels of doctor_rating and nurse_rating into tmp string variable.
compute tmp = concat(
"doctor_rating = ",string(doctor_rating,f1)," (",rtrim(valuelabel(doctor_rating)),") ",
"nurse_rating = ",string(nurse_rating,f1)," (",rtrim(valuelabel(nurse_rating)),") "
).
*3. Convert string variable into numeric.
autorecode tmp
/into doctor_and_nurse_rating.
*4. Delete tmp string variable.
delete variables tmp.
*5. Optionally, apply variable label to end result.
variable labels doctor_and_nurse_rating 'Combination of doctor_rating and nurse_rating'.
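To check whether the syntax did what it should, it can help to inspect the new variable right after running it. The lines below are a minimal sketch of such a check, using only the variable names from the example above: the frequency table should list each combination that occurs in the data together with its combined value label.

```
*Inspect the combined variable and its value labels.
frequencies doctor_and_nurse_rating.
*Optionally, compare against a crosstab of the two original variables.
crosstabs doctor_rating by nurse_rating.
```

The nonempty cells of the crosstab should match the categories of the combined variable.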
SPSS Combine Categorical Variables - Other Data
We realize that many readers may find this syntax too difficult to rewrite for their own data files. So instead of rewriting it, just copy and paste it and make three basic adjustments before running it:
- replace “doctor_rating” with the name of the first variable you'd like to combine. Note that you can do so with the Ctrl + H shortcut.
- replace “nurse_rating” with the name of the second variable you'd like to combine.
- replace “doctor_and_nurse_rating” with the variable name you'd like to use for the final result.
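As an illustration, this is roughly what the syntax looks like after those three replacements. The names first_var, second_var and combined_var are hypothetical placeholders and not variables in hospital.sav, so substitute the names from your own data. The sketch also assumes single digit values, as in the original example.

```
*Hypothetical names: first_var, second_var and combined_var are placeholders.
string tmp(a1000).
compute tmp = concat(
"first_var = ",string(first_var,f1)," (",rtrim(valuelabel(first_var)),") ",
"second_var = ",string(second_var,f1)," (",rtrim(valuelabel(second_var)),") "
).
autorecode tmp
/into combined_var.
delete variables tmp.
variable labels combined_var 'Combination of first_var and second_var'.
```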
SPSS Combine Categorical Variables - System Missing Values
You may have noticed that the value labels of the combined variable don't look very nice if system missing values are present in the original variables. An example of such a value label is doctor_rating = 3 (Neutral) nurse_rating = . (). A nicer result can be obtained without changing the basic syntax for combining categorical variables: prior to running it, simply RECODE system missing values into a value that's not yet present in the original variables and apply a value label to it. The syntax below shows how to do so.
*1. Recode system missing values into 7.
recode doctor_rating nurse_rating (sysmis = 7).
*2. Apply value label to new value.
add value labels doctor_rating nurse_rating 7 'System missing'.
*3. Proceed with remaining syntax from here.
After doing so, the resulting value label will look as follows: doctor_rating = 3 (Neutral) nurse_rating = 7 (System missing). Further, note that the syntax we used made a couple of assumptions. Most real world data will satisfy those. We'll walk through them below.
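The replacement value must not already occur in either variable. One way to confirm this before recoding is to run frequency tables on the original variables, as sketched below. The value 7 is just the one used in this example, so pick a different value if it turns out to be taken in your data.

```
*List the values currently in use before choosing a replacement for system missing values.
frequencies doctor_rating nurse_rating.
```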
SPSS Combine Categorical Variables - Assumptions
- Although the syntax combines two variables, it can be expanded to incorporate three or more variables.
- It is assumed that all values in the original variables consist of single digits. If two or three digit values are present, replace f1 by n2 or n3. The n3 format left pads numbers with zeroes and thus keeps their alphabetical order equal to their numerical order.
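Both adjustments can be sketched in a single example. The syntax below combines three variables and uses the n2 format for values of up to two digits. Note that third_rating and combined_rating are hypothetical names used purely for illustration. Every line inside CONCAT ends with a comma except the last one.

```
*Sketch only: third_rating is a hypothetical third variable.
*The n2 format assumes values of at most two digits.
string tmp(a1000).
compute tmp = concat(
"doctor_rating = ",string(doctor_rating,n2)," (",rtrim(valuelabel(doctor_rating)),") ",
"nurse_rating = ",string(nurse_rating,n2)," (",rtrim(valuelabel(nurse_rating)),") ",
"third_rating = ",string(third_rating,n2)," (",rtrim(valuelabel(third_rating)),") "
).
autorecode tmp
/into combined_rating.
delete variables tmp.
```

Adding a fourth variable works the same way: add a comma to the end of the third_rating line and insert a similar line after it.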
Further Reading
Those who'd like a closer look at some of the commands and functions we combined in this tutorial may want to consult string variables, STRING function, VALUELABEL, CONCAT, RTRIM and AUTORECODE.
THIS TUTORIAL HAS 26 COMMENTS:
By Christine Gulla on June 2nd, 2016
Thank you, this actually solved it, you're great!
By Steph on July 29th, 2016
Hi there,
Thanks for the tutorial. I am trying to use your syntax for four variables, not two, but I can't seem to figure out how to amend the code in No (2). Hope you can help!
By Ruben Geert van den Berg on August 1st, 2016
Hi Steph! I can imagine the syntax being a bit difficult to modify as it combines many functions into a single line. I added some line breaks to the original that -hopefully- hint at how to add variables to it: copy the second line and replace "doctor_rating" (all instances) with another variable name.
Following this line of thought, I wrote a tiny example that combines four variables and uploaded it to Combine Four Categorical Variables with Value Labels. Please note that lines 2, 3, 4 and 5 are very similar although line 5 -the last variable- doesn't end with a comma as nothing is added to it. I hope this example makes the whole thing a bit more understandable.
P.s. I might -at some point- build a menu based tool that creates the right syntax for this but that may take another couple of months as my schedule is very full.
By Daniel on December 3rd, 2016
This actually worked! Thank you so much Ruben Geert van den Berg.
By Ruben Geert van den Berg on December 3rd, 2016
Hi Daniel, of course it worked. Nevertheless, the syntax scares off many SPSS users. I'd rather build a simple menu based tool for this but I've such a long "to-do" list that I just can't find the time. Hopefully next year.
Best,
Ruben