RECODE replaces data values with different values. It comes in handy for merging categories, dichotomizing continuous variables and some other tasks. This tutorial walks you through its main options, best practices and pitfalls.
To get proficient with RECODE quickly, we recommend following along with the examples. You'll soon notice that recoding from syntax is very simple and much faster than from the GUI. All examples use supermarket.sav.
1. Merge Categories of One Variable
In this example we'll merge categories 1 and 2 of a variable v1. We'll do this by changing all values of 1 into 2. This is as simple as recode v1 (1 = 2). The screenshot illustrates the effect. All values that are not 1 are left unaltered. We'll run FREQUENCIES right before and after recoding so we can check the results.
SPSS Recode Syntax Example 1
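*1. Show both values and value labels in output tables and run frequencies before recoding.
set tnumbers both.
freq v1.
*2. Recode v1 and correct value labels.
recode v1 (1=2).
add value labels v1 2 'Not at all or a bit' 1 ''.
*3. Check with previous frequency table.
freq v1.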
Note that after recoding the value labels are no longer correct. For more on this, see SPSS Recode - Cautionary Note. We therefore adjust the value label for 2 and remove the label for 1.
2. Dichotomize Multiple Variables
We'll dichotomize variables v4 to v6 by changing values 1, 2 and 3 into 0 and values 4 and 5 into 1, as implied by recode v4 to v6 (1,2,3 = 0)(4,5 = 1). Value 6 is left unaltered. After recoding we must respecify the value labels for all three variables. The reason why we need two quotes in don''t know is explained in Escape Sequence (General Concept).
SPSS Recode Syntax Example 2
*1. Check frequencies before recoding.
freq v4 to v6.
*2. Recode and apply new value labels.
recode v4 to v6 (1,2,3 = 0)(4,5 = 1).
value labels v4 to v6 0 'Bottom three' 1 'Top two' 6 'Don''t know'.
*3. Check against previous frequencies.
freq v4 to v6.
3. Merge Categories into New Variable
In the previous examples the original values were overwritten by the recoded values. An alternative is creating a new variable holding the recoded values. This is done by using the INTO keyword, like so: recode v2 (1=2) into rec_v2. However, this doesn't tell which values rec_v2 should hold if v2 is not 1, resulting in lots of system missing values. Here we can use ELSE, which means “all values that were not previously addressed”. For copying them from v2 into rec_v2 we'll use (ELSE = COPY).
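The pitfall described above is easy to see for yourself. The lines below are a minimal sketch, assuming v2 from supermarket.sav; tmp_v2 is a hypothetical scratch variable name that is removed again at the end.

```spss
* Sketch: without ELSE, every value of v2 other than 1 ends up system missing in tmp_v2.
recode v2 (1=2) into tmp_v2.
* The frequency table should show only value 2 plus system missing values.
frequencies tmp_v2.
* Remove the scratch variable again.
delete variables tmp_v2.
```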
SPSS Recode Syntax Example 3
*1. Recode v2 into new variable rec_v2.
recode v2 (1=2)(else=copy) into rec_v2.
*2. Cross old with new values as check.
crosstabs v2 by rec_v2 /cells count /missing include.
*Note: rec_v2 doesn't have labels or missing values defined yet.
This example shows some disadvantages of recoding into new variables. First, note that the new variables don't have any dictionary information at all.
Second, the new variables are appended to the end of the active dataset. Therefore, you can't address a range of original and recoded variables by using the TO or ALL keywords. However, an easy way to reorder variables is using MATCH FILES.
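As a rough illustration of the MATCH FILES approach, the sketch below places rec_v2 directly after v2. It assumes rec_v2 has already been created; as a side effect, the variables listed on KEEP move to the front of the file, and ALL stands for all remaining variables in their existing order.

```spss
* Sketch: list the desired variable order on KEEP, then ALL for everything else.
match files file = * /keep = v1 v2 rec_v2 all.
execute.
```

After this, v2 to rec_v2 addresses just the original and its recoded version.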
4. Dichotomize Multiple Variables into New Variables
Recoding several variables into several new variables is straightforward: simply fill in multiple input variable names after RECODE and multiple output variable names after INTO. Just make sure that the number of input variables matches the number of output variables.
This example uses LO THRU 3, which means “the lowest value through 3”. In a similar vein, HI can be used for the highest value.
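A small sketch of both points, assuming v7 to v9 from supermarket.sav: the input and output variables are listed one by one (three of each) and HI is used as the upper boundary. The names top_v7 to top_v9 are hypothetical.

```spss
* Sketch: explicit variable lists, three input and three output variables.
* Note that 4 thru hi catches every value of 4 or above, so a code such as 6 would also become 1.
recode v7 v8 v9 (lo thru 3 = 0)(4 thru hi = 1) into top_v7 top_v8 top_v9.
* Check one pair of variables.
crosstabs v7 by top_v7 /cells count /missing include.
```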
Optionally, users who have the SPSS Python Essentials installed can generate the crosstabs in a loop as shown in step 3B.
SPSS Recode Syntax Example 4
*1. Check frequencies before recoding.
freq v7 to v9.
*2. Recode into new variables.
recode v7 to v9 (lo thru 3 = 0)(4,5 = 1)(else = 2) into rec_v7 to rec_v9.
*3A. Check against original values.
crosstabs v7 by rec_v7 /cells count /missing include.
crosstabs v8 by rec_v8 /cells count /missing include.
crosstabs v9 by rec_v9 /cells count /missing include.
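*3B. Alternative for 3A - have Python generate crosstabs.
set mprint on.
begin program.
import spss
# Loop over suffixes 7, 8 and 9 and submit one CROSSTABS command for each.
for suff in range(7,10):
    spss.Submit('crosstabs v%(suff)d by rec_v%(suff)d /cells count /missing include.'%locals())
end program.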
5. Recode Continuous into Discrete Variable
Values are recoded only once by RECODE. The old and new value pairs are read from left to right and an old value that's already been addressed will be ignored if it's addressed again. This is also the reason that there's no point in specifying any old values after the ELSE keyword.
This feature is sometimes used when discretizing continuous variables: you can use LO (the lowest value that hasn't been previously addressed) as the lower boundary for each category. The syntax below looks a bit awkward but is not unusual. As demonstrated, a descriptives by category table is a nice way to inspect these results. Finally, note that RANK offers an alternative for discretizing variables.
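The left-to-right rule can be seen in a small sketch with two deliberately overlapping ranges. It assumes the income variable from supermarket.sav; demo_class is a hypothetical scratch variable name.

```spss
* Sketch: incomes from 2000 through 2500 fall in both ranges but only the first pair applies.
recode income (lo thru 2500 = 1)(2000 thru 3000 = 2) into demo_class.
* The minimum income in class 2 should be above 2500.
* Incomes above 3000 are not addressed and are system missing in demo_class.
means income by demo_class /cells count min max.
```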
SPSS Recode Syntax Example 5
*1. Recode income into income classes.
recode income (lo thru 2000 = 1)(lo thru 2500 = 2)(lo thru 3000 = 3)(lo thru 3500 = 4)(lo thru hi = 5) into income_class.
*2. Check income descriptives per income class.
means income by income_class
/cells count min mean max.
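For comparison, the RANK alternative mentioned above could look roughly like the sketch below. It creates five groups of roughly equal size rather than groups with fixed income boundaries; income_q5 is a hypothetical variable name.

```spss
* Sketch: discretize income into 5 roughly equal-sized groups with RANK.
rank variables = income (a) /ntiles(5) into income_q5 /print = no.
* Inspect the income range within each group.
means income by income_q5 /cells count min mean max.
```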
6. Clone a Variable
A disadvantage of recoding into new variables is they don't have any dictionary information by default. However, we can clone a variable with its dictionary information by combining RECODE with APPLY DICTIONARY. This is basically what our SPSS Clone Variables Tool does for many variables at once. The tool also checks whether input variables are string variables. If so, it automatically declares the new string variables with the lengths needed for recoding into them.
After cloning, we can safely recode into the same variables, leaving the variable order intact and minimizing the need for dictionary modifications after recoding. In case of doubt we can always check the recoded variable against its clone and if necessary delete it and start over from a new clone.
SPSS Recode Syntax Example 6
*1. Copy all values into new variable.
recode v10 (else = copy) into rec_v10.
*2. Clone dictionary onto new variable.
apply dictionary from * /source variables = v10 /target variables = rec_v10.
*3. Check clone against original.
crosstabs v10 by rec_v10 /cells count /missing include.
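Once the clone exists, recoding the original in place and checking it against the clone might look like the sketch below. The merged codes are purely hypothetical, since the categories of v10 aren't described in this tutorial.

```spss
* Sketch: recode v10 in place, rec_v10 still holds the original values.
recode v10 (1,2 = 1)(3 thru 5 = 2).
* Rows show the new values, columns show the original values held by the clone.
crosstabs v10 by rec_v10 /cells count /missing include.
```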
7. Recode String to Numeric Variable
In some cases you may want to recode a string variable into a numeric one. This holds especially when you want to do calculations on ordinal variables under the Assumption of Equal Intervals. Note that we can't use AUTORECODE here because we don't want our values to follow the alphabetical order of our string values.
Keep in mind that you can RECODE and apply value labels to many variables at once. Unfortunately, copying the variable labels from the old to the new variables requires some more work but this can be automated with Python if desired.
SPSS Recode Syntax Example 7
*1. Create test data.
data list free / s1(a10).
begin data
'Very bad' 'Bad' 'Neutral' 'Good' 'Very good'
end data.
*2. Recode string into numeric variable.
recode s1 ('Very bad' = 1)('Bad' = 2) ('Neutral' = 3)('Good' = 4)('Very good' = 5) into n1.
*3. Apply value labels.
value labels n1 1 'Very bad' 2 'Bad' 3 'Neutral' 4 'Good' 5 'Very good'.
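Doing the same for several variables at once could look like the sketch below. Note that s2 and s3 are hypothetical string variables, assumed to hold the same answer categories as s1; they are not part of the test data created above.

```spss
* Sketch: recode three string variables into three numeric variables in one go.
* String values must match exactly, including upper and lower case.
recode s1 s2 s3 ('Very bad' = 1)('Bad' = 2)('Neutral' = 3)('Good' = 4)('Very good' = 5) into n1 n2 n3.
* Apply the same value labels to all three new variables.
value labels n1 n2 n3 1 'Very bad' 2 'Bad' 3 'Neutral' 4 'Good' 5 'Very good'.
```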
This tutorial didn't cover some of the more exotic RECODE options. We rarely see these in practice and we didn't want to go into more detail than we already did. Further options are covered by the Command Syntax Reference (CSR).