In SPSS, we sometimes encounter variables that are negatively coded: lower values indicate higher agreement or more positive sentiments. Although this is not really “wrong” in any way, positive coding is more intuitive. Especially when reporting means over variables, most readers naturally expect higher means to indicate something better, not worse. So what's the best way to reverse code variables? And how do we make sure the result is correct?
Example - Employee Survey
A tiny employee survey was held in which employees rated some job aspects. The data are in reversed-items.sav, part of which is shown below.
1. Inspect Coding
Inspecting how values have been coded is one of my routine checks for categorical variables. I usually just run a quick FREQUENCIES command which tells me basically all I need to know.
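*Show both values and value labels, and both variable names and labels, in output tables.
set tnumbers both tvars both.
*Create frequency tables and bar charts for v1 through v10.
frequencies v1 to v10
/barchart.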
The first frequency distribution shows that v1 is positively coded as shown below.
Scrolling down a bit, it appears that v2, v3, v6 and v10 are all negatively coded. I'll reverse the coding for just these 4 variables in the next steps.
2. Copy Value Labels into New Variables
To make sure I do everything correctly, I want to compare the new and old values after reversing some items. The best way to do so is using the SPSS Clone Variables Tool. However, plain syntax will do as well: I'll pass the value labels into some new string variables. If this doesn't make any sense yet: read on. It will in a minute.
*Declare new string variables.
string s2 s3 s6 s10 (a25).
*Save value labels of v2, v3, v6 and v10 into new strings.
do repeat #new = s2 s3 s6 s10 / #old = v2 v3 v6 v10.
compute #new = valuelabel(#old).
end repeat.
3. Recode and Adjust Value Labels
I'll now fix the problem in 2 steps:
- I'll RECODE values 1, 2, 4 and 5 into 5, 4, 2 and 1. Values 3 (Neutral) and 6 (Don't know) are as desired so I'll skip those.
- I'll adjust the value labels of the recoded values accordingly.
*Reverse code v2, v3, v6 and v10; values 3 and 6 are left unchanged.
recode v2 v3 v6 v10 (1 = 5)(2 = 4)(4 = 2)(5 = 1).
*Correct value labels for v2, v3, v6 and v10.
add value labels v2 v3 v6 v10 1 "Totally disagree" 2 "Somewhat disagree" 4 "Somewhat Agree" 5 "Totally agree".
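To see why the recode and the relabeling belong together, here's a small sketch in plain Python (not SPSS syntax). It mimics the two steps on a lookup table and checks that every answer keeps its label while only the number underneath changes. Note that the original labels for values 1, 2, 4 and 5 are my assumption: this tutorial only names “Neutral” and “Don't know” for the old coding, so I simply mirrored the new labels.

```python
# Illustrative Python sketch, not SPSS syntax.
# ASSUMPTION: the original (negatively coded) labels mirror the new ones;
# only "Neutral" (3) and "Don't know" (6) are named in the text.
old_labels = {
    1: "Totally agree",
    2: "Somewhat Agree",
    3: "Neutral",
    4: "Somewhat disagree",
    5: "Totally disagree",
    6: "Don't know",
}

# Same mapping as the RECODE command; unlisted values stay as they are.
recode_map = {1: 5, 2: 4, 4: 2, 5: 1}


def reverse(value):
    """Return the reversed code, leaving 3 and 6 untouched."""
    return recode_map.get(value, value)


# Mimic ADD VALUE LABELS: the four listed labels overwrite the old ones,
# the labels for 3 and 6 are kept.
new_labels = dict(old_labels)
new_labels.update({
    1: "Totally disagree",
    2: "Somewhat disagree",
    4: "Somewhat Agree",
    5: "Totally agree",
})

# Every answer should keep its label; only the numeric code changes.
for old_value, old_label in old_labels.items():
    new_value = reverse(old_value)
    assert new_labels[new_value] == old_label
    print(old_value, "->", new_value, old_label)
```

If only the recode were run without adjusting the labels, the check in the loop would fail for values 1, 2, 4 and 5. That's exactly the mistake the comparison in the next step is meant to catch.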
4. Check Results
I'm basically done. However, I want to make sure my results are correct and I can do so with some simple CROSSTABS as shown below. If you reversed a large number of variables, you could loop over these commands with Python but it's not worth the effort for just 4 variables.
crosstabs v6 by s6.
All cases who responded “Neutral” on the original variable (copied into s6) still respond “Neutral” on the reversed variable v6. This holds for all values of all reversed variables. The only things we changed are the numeric values underlying these answers, exactly as planned.
By recoding into the same variables, we retained all dictionary information such as variable labels, formats and user missing values (which still need to be set for these data). On top of that, all variables still have their original names and order.
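For readers who did reverse a long list of variables, the CROSSTABS check from this step could be generated rather than typed. Below is a minimal Python sketch that only builds the syntax as text. The function name crosstabs_syntax is hypothetical, and it assumes each string copy is named like its source variable with “s” instead of “v”, as in this tutorial.

```python
# Illustrative sketch: build one CROSSTABS command per reversed variable.
reversed_vars = ["v2", "v3", "v6", "v10"]


def crosstabs_syntax(variables):
    """Return SPSS syntax crossing each vN with its string copy sN."""
    commands = []
    for var in variables:
        # ASSUMPTION: the copy of v2 is called s2, the copy of v3 is s3, etc.
        copy = "s" + var[1:]
        commands.append("crosstabs {} by {}.".format(var, copy))
    return "\n".join(commands)


syntax = crosstabs_syntax(reversed_vars)
print(syntax)

# Inside SPSS with the Python integration installed, this text could be
# run with spss.Submit(syntax). That call is left out here so the sketch
# also runs outside SPSS.
```

The printed commands can also be pasted straight into a syntax window, which is probably the quicker route for a handful of variables.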
5. Clean Things Up
Now that we're done, our new string variables are redundant so let's remove them. We'll also set 6 (“Don't know”) as a user missing value.
delete variables s2 to s10.
*Set 6 as user missing value for questionnaire items.
missing values v1 to v10 (6).
Thanks for reading!