SPSS RECODE replaces data values with different values. It comes in handy for merging categories, dichotomizing continuous variables and some other tasks. This tutorial walks you through its main options, best practices and pitfalls.
To become proficient with RECODE quickly, we recommend you follow along with the examples. You'll soon notice that recoding from syntax is simple and much faster than recoding from the GUI. All examples use supermarket.sav.
1. Merge Categories of One Variable
In this example we'll merge categories 1 and 2 of a variable v1 by changing all values of 1 into 2. This is as simple as
recode v1 (1 = 2).
The screenshot illustrates the effect. All values that are not 1 are left unaltered. We'll run FREQUENCIES right before and after recoding so we can check the results.
SPSS Recode Syntax Example 1
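*1. Show values and value labels in output, then run frequencies before recoding.
set tnumbers both.
frequencies v1.
*2. Recode v1 and correct value labels.
recode v1 (1=2).
add value labels v1 2 'Not at all or a bit' 1 ''.
*3. Check with previous frequency table.
frequencies v1.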
Note that after recoding the value labels are no longer correct. For more on this, see SPSS Recode - Cautionary Note. We therefore adjust the value label for 2 and remove the label for 1.
2. Dichotomize Multiple Variables
We'll dichotomize variables v4 to v6 by changing values 1, 2 and 3 into 0 and values 4 and 5 into 1, as implied by
recode v4 to v6 (1,2,3 = 0)(4,5 = 1).
Value 6 is left unaltered. After recoding we must respecify the value labels for all three variables. The reason why we need two quotes in don''t know is explained in Escape Sequence (General Concept).
SPSS Recode Syntax Example 2
*1. Check frequencies before recoding.
freq v4 to v6.
*2. Recode and apply new value labels.
recode v4 to v6 (1,2,3 = 0)(4,5 = 1).
value labels v4 to v6 0 'Bottom three' 1 'Top two' 6 'Don''t know'.
*3. Check against previous frequencies.
freq v4 to v6.
3. Merge Categories into New Variable
In the previous examples the original values were overwritten by the recoded values. An alternative is creating a new variable holding the recoded values. This is done by using the INTO keyword like so:
recode v2 (1=2) into rec_v2.
However, this doesn't tell SPSS which values rec_v2 should hold if v2 is not 1, resulting in lots of system missing values. Here we can use ELSE, which means “all values that were not previously addressed”. For copying them from v2 into rec_v2 we'll use (ELSE = COPY).
SPSS Recode Syntax Example 3
*1. Recode v2 into new variable rec_v2.
recode v2 (1=2)(else=copy) into rec_v2.
*2. Cross old with new values as check.
crosstabs v2 by rec_v2 /cells count /missing include.
*Note: rec_v2 doesn't have labels or missing values defined yet.
This example shows some disadvantages of recoding into new variables. First, note that the new variables don't have any dictionary information at all.
Second, the new variables are appended to the end of the active dataset. Therefore, you can't address a range of original and recoded variables by using the TO or ALL keywords. However, an easy way to reorder variables is using MATCH FILES.
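The difference between recoding INTO a new variable with and without (ELSE = COPY) can be sketched in plain Python. This is only an illustration of the behavior described in this example, not SPSS code: the function recode_into and the sample values are hypothetical, and None stands in for a system missing value.

```python
# Hypothetical stand-in for a few values of v2 (not taken from supermarket.sav).
v2 = [1, 2, 3, 1, 5]

def recode_into(values, mapping, else_copy=False):
    """Mimic RECODE ... INTO a new variable.

    Values not addressed by the mapping become None (system missing)
    unless else_copy is True, which mimics (ELSE = COPY).
    """
    result = []
    for value in values:
        if value in mapping:
            result.append(mapping[value])
        elif else_copy:
            result.append(value)
        else:
            result.append(None)
    return result

without_else = recode_into(v2, {1: 2})                  # recode v2 (1=2) into rec_v2.
with_else = recode_into(v2, {1: 2}, else_copy=True)     # recode v2 (1=2)(else=copy) into rec_v2.

print(without_else)  # [2, None, None, 2, None]
print(with_else)     # [2, 2, 3, 2, 5]
```

Without the ELSE rule, only the cases holding 1 receive a value in the new variable, which is why ELSE = COPY is needed here.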
4. Dichotomize Multiple Variables into New Variables
Recoding several variables into several new variables is straightforward: simply fill in multiple input variable names after RECODE and multiple output variable names after INTO. Just make sure that the number of input variables matches the number of output variables.
This example uses LO THRU 3 which means “the lowest value through 3”. In a similar vein, HI can be used for the highest value.
Optionally, users who have the SPSS Python Essentials installed can generate the crosstabs in a loop as shown in step 3B.
SPSS Recode Syntax Example 4
*1. Check frequencies before recoding.
freq v7 to v9.
*2. Recode into new variables.
recode v7 to v9 (lo thru 3 = 0)(4,5 = 1)(else = 2) into rec_v7 to rec_v9.
*3A. Check against original values.
crosstabs v7 by rec_v7 /cells count /missing include.
crosstabs v8 by rec_v8 /cells count /missing include.
crosstabs v9 by rec_v9 /cells count /missing include.
*3B. Alternative for 3A - have Python generate crosstabs.
set mprint on.
begin program.
import spss
for suff in range(7,10):
    spss.Submit('crosstabs v%(suff)d by rec_v%(suff)d /cells count /missing include.'%locals())
end program.
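To preview the commands that the loop in step 3B generates, the same string formatting can be run in plain Python without the spss module. This is a minimal sketch: the list name commands is introduced here for illustration, and nothing is submitted to SPSS.

```python
# Build the CROSSTABS commands as plain strings instead of passing them to spss.Submit.
template = "crosstabs v%(suff)d by rec_v%(suff)d /cells count /missing include."

commands = [template % {"suff": suff} for suff in range(7, 10)]

for command in commands:
    print(command)
# crosstabs v7 by rec_v7 /cells count /missing include.
# crosstabs v8 by rec_v8 /cells count /missing include.
# crosstabs v9 by rec_v9 /cells count /missing include.
```

Note that range(7, 10) stops before 10, so the loop covers suffixes 7, 8 and 9.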
5. Recode Continuous into Discrete Variable
Values are recoded only once by RECODE. The old and new value pairs are read from left to right and an old value that's already been addressed will be ignored if it's addressed again. This is also the reason that there's no point in specifying any old values after the ELSE keyword.
This feature is sometimes used when discretizing continuous variables: you can use LO (the lowest value that hasn't been previously addressed) as the lower boundary for each category. The syntax below looks a bit awkward but is not unusual. As demonstrated, a descriptives by category table is a nice way to inspect these results. Finally, note that RANK offers an alternative for discretizing variables.
SPSS Recode Syntax Example 5
*1. Recode income into income classes.
recode income (lo thru 2000 = 1)(lo thru 2500 = 2)(lo thru 3000 = 3)(lo thru 3500 = 4)(lo thru hi = 5) into income_class.
*2. Check income descriptives per income class.
means income by income_class
/cells count min mean max.
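The "first matching rule wins" logic behind this syntax can be illustrated with a small Python sketch. It is not how SPSS is implemented; it only mimics the left-to-right evaluation described above. The names RULES and income_class and the sample incomes are hypothetical, and LO and HI are represented by negative and positive infinity.

```python
import math

# (low, high, new_value) rules mirroring the RECODE specification in Example 5.
# Every rule starts at LO; THRU includes both boundaries.
RULES = [
    (-math.inf, 2000, 1),
    (-math.inf, 2500, 2),
    (-math.inf, 3000, 3),
    (-math.inf, 3500, 4),
    (-math.inf, math.inf, 5),
]

def income_class(income):
    """Return the new value of the first rule whose range contains income."""
    for low, high, new_value in RULES:
        if low <= income <= high:
            return new_value  # later rules are ignored once a value is addressed
    return None  # value not addressed by any rule

incomes = [1500, 2000, 2000.01, 2750, 3500, 9000]  # hypothetical incomes
classes = [income_class(x) for x in incomes]
print(classes)  # [1, 1, 2, 3, 4, 5]
```

Because an income of 1500 is already caught by the first rule, the overlapping later rules never get a chance to change it.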
6. Clone a Variable
A disadvantage of recoding into new variables is they don't have any dictionary information by default. However, we can clone a variable with its dictionary information by combining RECODE with APPLY DICTIONARY. This is basically what our SPSS Clone Variables Tool does for many variables at once. The tool also checks whether input variables are string variables. If so, it automatically declares the new string variables with the correct lengths that are needed for recoding into.
After cloning, we can safely recode into the same variables, leaving the variable order intact and minimizing the need for dictionary modifications after recoding. In case of doubt we can always check the recoded variable against its clone and if necessary delete it and start over from a new clone.
SPSS Recode Syntax Example 6
*1. Copy values into new variable.
recode v10 (else = copy) into rec_v10.
*2. Clone dictionary onto new variable.
apply dictionary from * /source variables = v10 /target variables = rec_v10.
*3. Check clone against original.
crosstabs v10 by rec_v10 /cells count /missing include.
7. Recode String to Numeric Variable
In some cases you may want to recode a string variable into a numeric one. This holds especially when you want to do calculations on ordinal variables under the Assumption of Equal Intervals. Note that we can't use AUTORECODE here because we don't want our values to follow the alphabetical order of our string values.
Keep in mind that you can RECODE and apply value labels to many variables at once. Unfortunately, copying the variable labels from the old to the new variables requires some more work but this can be automated with Python if desired.
SPSS Recode Syntax Example 7
*1. Create test data.
data list free / s1(a10).
begin data
'Very bad' 'Bad' 'Neutral' 'Good' 'Very good'
end data.
*2. Recode string into numeric variable.
recode s1 ('Very bad' = 1)('Bad' = 2) ('Neutral' = 3)('Good' = 4)('Very good' = 5) into n1.
*3. Apply value labels.
value labels n1 1 'Very bad' 2 'Bad' 3 'Neutral' 4 'Good' 5 'Very good'.
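Why alphabetical coding is unsuitable here can be shown with a short Python sketch. It compares the explicit codes from the RECODE command above with codes assigned in alphabetical order, which is only an approximation of what AUTORECODE would do by default. The dictionary names are hypothetical.

```python
# Answer categories in their intended ordinal order.
labels = ["Very bad", "Bad", "Neutral", "Good", "Very good"]

# Explicit mapping, as in the RECODE command of Example 7.
explicit = {label: code for code, label in enumerate(labels, start=1)}

# Codes assigned by alphabetical order of the string values instead.
alphabetical = {label: code for code, label in enumerate(sorted(labels), start=1)}

print(explicit)
# {'Very bad': 1, 'Bad': 2, 'Neutral': 3, 'Good': 4, 'Very good': 5}
print(alphabetical)
# {'Bad': 1, 'Good': 2, 'Neutral': 3, 'Very bad': 4, 'Very good': 5}
```

In the alphabetical version 'Good' gets a lower code than 'Very bad', so the numeric values no longer reflect the ordinal order of the answers.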
This tutorial didn't cover some more exotic RECODE options. The reason is that we rarely see these in practice and we didn't want to go into detail any further than we already did. Some more options than described here are covered by the command syntax reference.
THIS TUTORIAL HAS 8 COMMENTS:
By Ruben Geert van den Berg on September 5th, 2016
Hi Brandon! If I understand correctly, you're not looking for RECODE because -9999 will not be changed into a different number. Instead, try
MISSING VALUES [VARIABLE NAMES GO HERE] (-9999).
If you now run any results (means/frequencies) over these variables, you'll see that -9999 will be excluded from the results as if these were empty cells. For more, please see MISSING VALUES.
Hope that helps!
By Andrea Tristan on December 15th, 2020
I'm still new to using SPSS, but for some reason the dataset supermarket.sav shows the following warning and stops the execution of any commands:
The FREQUENCIES command requires a variable list. The most common cause of this error is a variable list that contains one or more undefined variables, and the variable list is not preceded by the optional VARIABLES keyword.
I would like to try out the examples as you present them but I'm not sure yet how to fix the dataset to try them out.
By Ruben Geert van den Berg on December 16th, 2020
Hi Andrea, thanks for your comment!
Before running any FREQUENCIES, make sure that supermarket.sav is open and that it's the only data file that's open.
In SPSS, you can have 2 or more data files open at once (this is better avoided at all times). In that case, FREQUENCIES may be applied to the other open dataset, which may not contain a variable v1, and that typically triggers the error you see.
I recommend going to Edit - Options (General Tab) and selecting "Open only one dataset at a time".
Hope that helps!