SPSS RECODE replaces data values with different values. It comes in handy for merging categories, dichotomizing continuous variables and some other tasks. This tutorial walks you through its main options, best practices and pitfalls.
To become proficient with RECODE quickly, we recommend following along with the examples. You'll soon notice that recoding from syntax is very simple and much faster than recoding from the GUI. All examples use supermarket.sav.
1. Merge Categories of One Variable
In this example we'll merge categories 1 and 2 of a variable v1. We'll do this by changing all values of 1 into 2. This is as simple as recode v1 (1 = 2). The screenshot illustrates the effect. All values that are not 1 are left unaltered. We'll run FREQUENCIES right before and after recoding so we can check the results.
SPSS Recode Syntax Example 1
*1. Show both values and value labels in output tables and inspect frequencies.
set tnumbers both.
freq v1.
*2. Recode v1 and correct value labels.
recode v1 (1=2).
add value labels v1 2 'Not at all or a bit' 1 ''.
*3. Check with previous frequency table.
freq v1.
Note that after recoding, the value labels are no longer correct. We therefore adjust the value label for 2 and remove the label for 1. For more on this, see SPSS Recode - Cautionary Note.
2. Dichotomize Multiple Variables
We'll dichotomize variables v4 to v6 by changing values 1, 2 and 3 into 0 and values 4 and 5 into 1, as implied by recode v4 to v6 (1,2,3 = 0)(4,5 = 1). Value 6 is left unaltered. After recoding we must respecify the value labels for all three variables. The reason why we need two quotes in don''t know is explained in Escape Sequence (General Concept).
SPSS Recode Syntax Example 2
*1. Inspect frequencies before recoding.
freq v4 to v6.
*2. Recode and apply new value labels.
recode v4 to v6 (1,2,3 = 0)(4,5 = 1).
value labels v4 to v6 0 'Bottom three' 1 'Top two' 6 'Don''t know'.
*3. Check against previous frequencies.
freq v4 to v6.
3. Merge Categories into New Variable
In the previous examples the original values were overwritten by the recoded values. An alternative is creating a new variable holding the recoded values. This is done by using the INTO keyword, like so: recode v2 (1=2) into rec_v2. However, this doesn't specify which values rec_v2 should hold if v2 is not 1, resulting in lots of system missing values. Here we can use ELSE, which means “all values that were not previously addressed”. For copying them from v2 into rec_v2 we'll use (ELSE = COPY).
SPSS Recode Syntax Example 3
*1. Recode v2 into new variable rec_v2.
recode v2 (1=2)(else=copy) into rec_v2.
*2. Cross old with new values as check.
crosstabs v2 by rec_v2 /cells count /missing include.
*Note: rec_v2 doesn't have labels or missing values defined yet.
This example shows some disadvantages of recoding into new variables. First, note that the new variables don't have any dictionary information at all.
Second, the new variables are appended to the end of the active dataset. Therefore, you can't address a range of original and recoded variables by using the TO or ALL keywords. However, an easy way to reorder variables is using MATCH FILES.
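The role of (ELSE = COPY) in this example can be mimicked outside SPSS. The following is a minimal plain Python sketch, not SPSS syntax. The function name recode_into and the sample values are made up for illustration, None stands in for system missing, and the target variable is assumed to be newly created.

```python
def recode_into(values, mapping, else_copy=False):
    """Emulate RECODE ... INTO a new variable for a list of values.

    mapping: dict of old value -> new value.
    else_copy: if True, behave like (ELSE = COPY); otherwise unmatched
    values become None, which stands in for system missing.
    """
    result = []
    for value in values:
        if value in mapping:
            result.append(mapping[value])
        elif else_copy:
            result.append(value)
        else:
            result.append(None)
    return result


v2 = [1, 2, 3, 1, 5]  # hypothetical sample values

without_else = recode_into(v2, {1: 2})
with_else = recode_into(v2, {1: 2}, else_copy=True)

print(without_else)
print(with_else)
```

Without the ELSE rule, every value other than 1 ends up missing in the new variable. With it, those values are carried over unchanged.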
4. Dichotomize Multiple Variables into New Variables
Recoding several variables into several new variables is straightforward: simply fill in multiple input variable names after RECODE and multiple output variable names after INTO. Just make sure that the number of input variables matches the number of output variables.
This example uses LO THRU 3 which means “the lowest value through 3”. In a similar vein, HI can be used for the highest value.
Optionally, users who have the SPSS Python Essentials installed can generate the crosstabs in a loop as shown in step 3B.
SPSS Recode Syntax Example 4
*1. Inspect frequencies before recoding.
freq v7 to v9.
*2. Recode.
recode v7 to v9 (lo thru 3 = 0)(4,5 = 1)(else = 2) into rec_v7 to rec_v9.
*3A. Check against original values.
crosstabs v7 by rec_v7 /cells count /missing include.
crosstabs v8 by rec_v8 /cells count /missing include.
crosstabs v9 by rec_v9 /cells count /missing include.
*3B. Alternative for 3A - have Python generate crosstabs.
set mprint on.
begin program.
import spss
for suff in range(7,10):
    spss.Submit('crosstabs v%(suff)d by rec_v%(suff)d /cells count /missing include.'%locals())
end program.
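To preview what the loop in step 3B submits, the commands can be built as plain strings first. This is a small sketch that runs in any Python interpreter, without the spss module. The names template and commands are illustrative only.

```python
# Build the three CROSSTABS commands as strings. Inside SPSS, each one
# would be passed to spss.Submit instead of being printed.
template = "crosstabs v{n} by rec_v{n} /cells count /missing include."
commands = [template.format(n=n) for n in range(7, 10)]

for command in commands:
    print(command)
```

Printing the generated syntax before submitting it is a cheap way to catch typos in the template.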
5. Recode Continuous into Discrete Variable
Values are recoded only once by RECODE. The old and new value pairs are read from left to right and an old value that's already been addressed will be ignored if it's addressed again. This is also the reason that there's no point in specifying any old values after the ELSE keyword.
This feature is sometimes used when discretizing continuous variables: you can use LO (the lowest value that hasn't been previously addressed) as the lower boundary for each category. The syntax below looks a bit awkward but is not unusual. As demonstrated, a descriptives by category table is a nice way to inspect these results. Finally, note that RANK offers an alternative for discretizing variables.
SPSS Recode Syntax Example 5
*1. Recode income into income_class.
recode income (lo thru 2000 = 1)(lo thru 2500 = 2)(lo thru 3000 = 3)(lo thru 3500 = 4)(lo thru hi = 5) into income_class.
*2. Check income descriptives per income class.
means income by income_class
/cells count min mean max.
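The left-to-right rule behind this example can be emulated in plain Python. The sketch below is not SPSS syntax: the function name income_class and the sample incomes are made up, infinity stands in for LO and HI, and THRU is assumed to include both boundaries.

```python
import math

# (lower, upper, new value) triples in the order they appear in the
# RECODE command; -inf and +inf stand in for LO and HI.
RULES = [
    (-math.inf, 2000, 1),
    (-math.inf, 2500, 2),
    (-math.inf, 3000, 3),
    (-math.inf, 3500, 4),
    (-math.inf, math.inf, 5),
]


def income_class(income):
    """Return the new value of the first rule whose range contains income."""
    for lower, upper, new_value in RULES:
        if lower <= income <= upper:
            return new_value
    return None  # no rule matched; stands in for system missing


incomes = [1500, 2000, 2000.01, 2750, 3500, 9000]  # hypothetical values
classes = [income_class(x) for x in incomes]
print(classes)
```

Because the first matching rule wins, an income of 2750 falls into class 3 even though it also lies within the ranges of the two rules that follow.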
6. Clone a Variable
A disadvantage of recoding into new variables is that they don't have any dictionary information by default. However, we can clone a variable with its dictionary information by combining RECODE with APPLY DICTIONARY. This is basically what our SPSS Clone Variables Tool does for many variables at once. The tool also checks whether input variables are string variables. If so, it automatically declares the new string variables with the lengths needed for recoding into them.
After cloning, we can safely recode into the same variables, leaving the variable order intact and minimizing the need for dictionary modifications after recoding. In case of doubt we can always check the recoded variable against its clone and if necessary delete it and start over from a new clone.
SPSS Recode Syntax Example 6
*1. Copy all values of v10 into new variable rec_v10.
recode v10 (else = copy) into rec_v10.
*2. Clone dictionary onto new variable.
apply dictionary from * /source variables = v10 /target variables = rec_v10.
*3. Check.
crosstabs v10 by rec_v10 /cells count /missing include.
7. Recode String to Numeric Variable
In some cases you may want to recode a string variable into a numeric one. This holds especially when you want to do calculations on ordinal variables under the Assumption of Equal Intervals. Note that we can't use AUTORECODE here because we don't want our values to follow the alphabetical order of our string values.
Keep in mind that you can RECODE and apply value labels to many variables at once. Unfortunately, copying the variable labels from the old to the new variables requires some more work but this can be automated with Python if desired.
SPSS Recode Syntax Example 7
*1. Create test data.
data list free / s1(a10).
begin data
'Very bad' 'Bad' 'Neutral' 'Good' 'Very good'
end data.
*2. Recode string into numeric variable.
recode s1 ('Very bad' = 1)('Bad' = 2) ('Neutral' = 3)('Good' = 4)('Very good' = 5) into n1.
exe.
*3. Apply value labels.
value labels n1 1 'Very bad' 2 'Bad' 3 'Neutral' 4 'Good' 5 'Very good'.
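The reason AUTORECODE doesn't fit here can be shown with a short plain Python sketch. It assumes that AUTORECODE assigns consecutive integers in ascending alphabetical order by default. The dictionary names are illustrative only.

```python
labels = ['Very bad', 'Bad', 'Neutral', 'Good', 'Very good']

# Codes assigned in alphabetical order (the assumed AUTORECODE default).
alphabetical = {label: code for code, label in enumerate(sorted(labels), start=1)}

# Codes assigned explicitly, in the order used by the RECODE command above.
intended = {label: code for code, label in enumerate(labels, start=1)}

print(alphabetical)
print(intended)
```

Alphabetical coding puts 'Bad' below 'Good' but 'Very bad' above 'Neutral', so the resulting numbers no longer reflect the ordinal scale.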
Final Notes
This tutorial didn't cover some more exotic RECODE options. The reason is that we rarely see these in practice and we didn't want to go into detail any further than we already did. Some more options than described here are covered by the command syntax reference.
THIS TUTORIAL HAS 8 COMMENTS:
By Jo on June 23rd, 2016
Hi
I'm having a hard time figuring out how to merge several numeric variables while keeping all cases, including those with the same value.
The reason is I'm looking at twins and therefore have
Var1=1 'child1_with_Value1' and
Var2=1 'child2_with_Value1'
but when I recode them into one variable it only includes Var1 (aka only one child).
I hope you can help me with a smart way of doing this? :)
By Ruben Geert van den Berg on June 23rd, 2016
Hi Jo!
Why do you want to combine several variables into one in the first place? My experience is that it's often done without any real need for it.
Now, you can't combine multiple variables into one with RECODE. One trick that does work is the formula
compute combined = v1 + v2 * 10 + v3 * 100.
but it requires all original values are single digits. A bigger problem with this is that the new variable won't have any value labels. If you'd like to avoid that, see Combine Categorical Variables but I should add that it's somewhat technical (I should have built a tool for this syntax but I don't have the time for doing so).
Let me know if that gets you any further, ok?
By walter Ayella on June 27th, 2016
Yes, it was an interesting tutorial, but I haven't found what I'm looking for. I want to recode variables like V001, V002, V003, V004, ... into one variable like V001 only.
By Ruben Geert van den Berg on June 28th, 2016
Hi Walter!
First, why would you want to combine many variables into one anyway? It often turns out that SPSS users do so without really needing it.
Now, you can't recode multiple variables into one with RECODE. The standard approach here is to add up all variables and multiply the last variable by 1, the second last variable by 10, the third last variable by 100 and so on. For a complete example -with test data- see SPSS - Recode Multiple Variables into One Variable Example.
This is fast but it only works if the original values are all single digits. Another issue is that the combined variable doesn't have any value labels but it's not obvious what value labels you'd like to have in this case. We might try and build a tool for this job some day because the question is asked pretty often and there isn't any really good solution for it yet.
Hope that helps!
By Brandon Gray on September 4th, 2016
Hello,
I would like to recode all variables in my data set. I transferred this SPSS file from a Stata file in which I had all missing values coded as -9999. I would now like to recode this so that SPSS recognizes these values as missing. Thank you for your help.