# SPSS tutorials


# SPSS RECODE – Simple Tutorial

SPSS RECODE replaces data values with different values. It comes in handy for merging categories, dichotomizing continuous variables and some other tasks. This tutorial walks you through its main options, best practices and pitfalls.

SPSS Recode Example 1

The quickest way to become proficient with RECODE is to follow along with the examples. You'll soon notice that recoding from syntax is very simple and much faster than recoding from the GUI. All examples use supermarket.sav.

## 1. Merge Categories of One Variable

In this example we'll merge categories 1 and 2 of a variable v1 by changing all values of 1 into 2. This is as simple as recode v1 (1 = 2). The screenshot illustrates the effect. All values that are not 1 are left unaltered. We'll run FREQUENCIES right before and after recoding so we can check the results.

## SPSS Recode Syntax Example 1

*1. Get values and value labels in output and inspect frequencies.

set tnumbers both.

freq v1.

*2. Recode v1 and correct value labels.

recode v1 (1=2).

add value labels v1 2 'Not at all or a bit' 1 ''.

*3. Check with previous frequency table.

freq v1.

Note that after recoding, the value labels are no longer correct. We therefore adjust the value label for 2 and remove the label for 1. For more on this, see SPSS Recode - Cautionary Note.

## 2. Dichotomize Multiple Variables

SPSS Recode Example 2

We'll dichotomize variables v4 to v6 by changing values 1, 2 and 3 into 0 and values 4 and 5 into 1, as implied by recode v4 to v6 (1,2,3 = 0)(4,5 = 1). Value 6 is left unaltered. After recoding we must respecify the value labels for all three variables. The reason why we need two quotes in don''t know is explained in Escape Sequence (General Concept).

## SPSS Recode Syntax Example 2

*1. Inspect frequencies.

freq v4 to v6.

*2. Recode and apply new value labels.

recode v4 to v6 (1,2,3 = 0)(4,5 = 1).

value labels v4 to v6 0 'Bottom three' 1 'Top two' 6 'Don''t know'.

*3. Check against previous frequencies.

freq v4 to v6.

## 3. Merge Categories into New Variable

In the previous examples the original values were overwritten by the recoded values. An alternative is creating a new variable holding the recoded values. This is done by using the INTO keyword, like so: recode v2 (1=2) into rec_v2. However, this doesn't tell SPSS which values rec_v2 should hold if v2 is not 1, resulting in lots of system missing values. Here we can use ELSE, which means “all values that were not previously addressed”. For copying them from v2 into rec_v2 we'll use (ELSE = COPY).
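To see the problem for yourself, a minimal sketch follows. The name tmp_v2 is just a placeholder that is assumed not to occur in supermarket.sav. The frequency table should show only value 2, plus system missing values for every case where v2 is not 1.

```spss
*Sketch: recode into a new variable WITHOUT else - tmp_v2 is a hypothetical name.
recode v2 (1 = 2) into tmp_v2.

*Cases where v2 is not 1 end up system missing in tmp_v2.
frequencies tmp_v2.

*Remove the placeholder variable again.
delete variables tmp_v2.
```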

## SPSS Recode Syntax Example 3

*1. Recode v2 into rec_v2.

recode v2 (1=2)(else=copy) into rec_v2.

*2. Cross old with new values as check.

crosstabs v2 by rec_v2 /cells count /missing include.

*Note: rec_v2 doesn't have labels or missing values defined yet.

A crosstab confirms that categories 1 and 2 have been merged into 2.

This example shows some disadvantages of recoding into new variables. First, note that the new variables don't have any dictionary information at all.
Second, the new variables are appended to the end of the active dataset. Therefore, you can't address a range of original and recoded variables by using the TO or ALL keywords. However, an easy way to reorder variables is using MATCH FILES.
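Both disadvantages can be worked around with a few extra lines, roughly as sketched below. The sketch assumes that v1 and v2 are the first two variables in the file, which may not hold for your copy of supermarket.sav, so adjust the KEEP list as needed. The labels are made up for illustration.

```spss
*Sketch: move rec_v2 right after v2 - assumes v1 and v2 come first in the file.
match files file = * /keep v1 v2 rec_v2 all.
execute.

*Add dictionary information by hand - both labels below are hypothetical.
variable labels rec_v2 'v2 with categories 1 and 2 merged'.
value labels rec_v2 2 'Categories 1 and 2 merged'.
```

The ALL keyword at the end of KEEP retains every remaining variable in its original order.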

## 4. Dichotomize Multiple Variables into New Variables

Recoding several variables into several new variables is straightforward: simply fill in multiple input variable names after RECODE and multiple output variable names after INTO. Just make sure that the number of input variables matches the number of output variables.
This example uses LO THRU 3 which means “the lowest value through 3”. In a similar vein, HI can be used for the highest value.
Optionally, users who have the SPSS Python Essentials installed can generate the crosstabs in a loop as shown in step 3B.

## SPSS Recode Syntax Example 4

*1. Check frequencies.

freq v7 to v9.

*2. Recode.

recode v7 to v9 (lo thru 3 = 0)(4,5 = 1)(else = 2) into rec_v7 to rec_v9.

*3A. Check against original values.

crosstabs v7 by rec_v7 /cells count /missing include.
crosstabs v8 by rec_v8 /cells count /missing include.
crosstabs v9 by rec_v9 /cells count /missing include.

*3B. Alternative for 3A - have Python generate crosstabs.

set mprint on.

begin program.
import spss
# Loop over the variable suffixes 7, 8 and 9 and run one CROSSTABS per pair.
for suff in range(7,10):
    spss.Submit('crosstabs v%(suff)d by rec_v%(suff)d /cells count /missing include.'%locals())
end program.
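Keep in mind that ELSE also catches system missing and user missing values, so (else = 2) in step 2 turns those into 2 as well. If that's not what you want, one option is to address missing values before ELSE, as in the sketch below. The names alt_v7 to alt_v9 are placeholders, and whether v7 to v9 actually contain missing values is an assumption.

```spss
*Sketch: keep missing values out of the ELSE category - alt_v7 to alt_v9 are hypothetical names.
recode v7 to v9 (missing = sysmis)(lo thru 3 = 0)(4,5 = 1)(else = 2) into alt_v7 to alt_v9.

*Missing values now show up as system missing instead of 2.
frequencies alt_v7 to alt_v9.
```

This works because (missing = sysmis) comes first: values it addresses are not recoded again by the specifications to its right.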

## 5. Recode Continuous into Discrete Variable

Values are recoded only once by RECODE. The old and new value pairs are read from left to right and an old value that's already been addressed will be ignored if it's addressed again. This is also the reason that there's no point in specifying any old values after the ELSE keyword.
This feature is sometimes used when discretizing continuous variables: you can use LO (the lowest value that hasn't been previously addressed) as the lower boundary for each category. The syntax below looks a bit awkward but is not unusual. As demonstrated, a descriptives by category table is a nice way to inspect these results. Finally, note that RANK offers an alternative for discretizing variables.

## SPSS Recode Syntax Example 5

*1. Recode income into income classes.

recode income (lo thru 2000 = 1)(lo thru 2500 = 2)(lo thru 3000 = 3)(lo thru 3500 = 4)(lo thru hi = 5) into income_class.

*2. Check income descriptives per income class.

means income by income_class
/cells count min mean max.
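For comparison, here's a rough sketch of the RANK alternative mentioned above. It splits income into five groups of roughly equal size rather than at fixed boundaries, so the resulting classes will generally differ from income_class. The name income_q5 is a placeholder.

```spss
*Sketch: discretize income into 5 roughly equal-sized groups - income_q5 is a hypothetical name.
rank variables = income
/ntiles(5) into income_q5.

*Inspect the boundaries that RANK came up with.
means income by income_q5
/cells count min mean max.
```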

## 6. Clone a Variable

A disadvantage of recoding into new variables is that they don't have any dictionary information by default. However, we can clone a variable with its dictionary information by combining RECODE with APPLY DICTIONARY. This is basically what our SPSS Clone Variables Tool does for many variables at once. The tool also checks whether input variables are string variables. If so, it automatically declares the new string variables with the lengths needed for recoding into them.
After cloning, we can safely recode into the same variables, leaving the variable order intact and minimizing the need for dictionary modifications after recoding. In case of doubt we can always check the recoded variable against its clone and if necessary delete it and start over from a new clone.

## SPSS Recode Syntax Example 6

*1. Clone values into new variable.

recode v10 (else = copy) into rec_v10.

*2. Clone dictionary onto new variable.

apply dictionary from * /source variables = v10 /target variables = rec_v10.

*3. Check.

crosstabs v10 by rec_v10 /cells count /missing include.
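With the clone in place, a workflow along the following lines becomes possible. The recode (1 = 2) is only an illustration, since which values you'd actually merge in v10 depends on your analysis. Copying the values back from the clone is shown as one possible way to undo a mistake.

```spss
*Sketch: recode v10 in place - the (1 = 2) recode is just an example.
recode v10 (1 = 2).

*Rows hold the untouched clone, columns hold the recoded values.
crosstabs rec_v10 by v10 /cells count /missing include.

*If something went wrong, restore the original values from the clone.
recode rec_v10 (else = copy) into v10.
execute.
```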

## 7. Recode String to Numeric Variable

In some cases you may want to recode a string variable into a numeric one. This holds especially when you want to do calculations on ordinal variables under the Assumption of Equal Intervals. Note that we can't use AUTORECODE here because we don't want our values to follow the alphabetical order of our string values.
Keep in mind that you can RECODE and apply value labels to many variables at once. Unfortunately, copying the variable labels from the old to the new variables requires some more work but this can be automated with Python if desired.

## SPSS Recode Syntax Example 7

*1. Create mini dataset.

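*Note: the data lines below are illustrative values matching the categories recoded in step 2.

data list free / s1(a10).
begin data
'Very bad' Bad Neutral Good 'Very good'
end data.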

*2. Recode string into numeric variable.

recode s1 ('Very bad' = 1)('Bad' = 2) ('Neutral' = 3)('Good' = 4)('Very good' = 5) into n1.
exe.

*3. Apply value labels.

value labels n1 1 'Very bad' 2 'Bad' 3 'Neutral' 4 'Good' 5 'Very good'.
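String values that aren't listed in RECODE end up system missing in n1. Since string comparisons are case sensitive, a value like 'very bad' would not match 'Very bad'. A quick check, sketched here on the assumption that s1 and n1 exist as created above, is to list the string values of the cases that didn't get a number:

```spss
*Sketch: show which s1 values were not matched by RECODE.
temporary.
select if sysmis(n1).
frequencies s1.
```

If every string value was matched, no cases are selected and there's nothing to fix.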

## Final Notes

This tutorial didn't cover some more exotic RECODE options. The reason is that we rarely see these in practice and we didn't want to go into detail any further than we already did. Some more options than described here are covered by the command syntax reference.


# This tutorial has 6 comments

• ### By Ruben Geert van den Berg on September 5th, 2016

Hi Brandon! If I understand correctly, you're not looking for RECODE because -9999 will not be changed into a different number. Instead, try

MISSING VALUES [VARIABLE NAMES GO HERE] (-9999).

If you now run any results (means/frequencies) over these variables, you'll see that -9999 will be excluded from the results as if these were empty cells. For more, please see MISSING VALUES.

Hope that helps!