In SPSS, we sometimes encounter variables that are negatively coded: lower values indicate higher agreement or more positive sentiments. Although this is not really “wrong” in any way, positive coding is more intuitive. Especially when reporting means over variables, most readers naturally expect higher means to indicate something better, not worse. So what's the best way to reverse code variables? And how do we ensure the result is correct?
Example - Employee Survey
A tiny employee survey was held in which employees rated some job aspects. The data are in reversed-items.sav, part of which is shown below.
1. Inspect Coding
Inspecting how values have been coded is one of my routine checks for categorical variables. I usually just run a quick FREQUENCIES command which tells me basically all I need to know.
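*Show both values and value labels (and both variable names and labels) in output tables.
set tnumbers both tvars both.
*Create frequency tables and bar charts for v1 through v10.
frequencies v1 to v10
/barchart.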
The first frequency distribution shows that v1 is positively coded as shown below.
Scrolling down a bit, it appears that v2, v3, v6 and v10 are all negatively coded. I'll reverse the coding for just these 4 variables in the next steps.
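If the frequency tables get long, the value labels can also be listed on their own. A minimal sketch, assuming the DISPLAY DICTIONARY command (not used in the original steps) and the four variables that appeared negatively coded:

```spss
*List dictionary information, including value labels, for the suspected items.
display dictionary /variables = v2 v3 v6 v10.
```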
2. Copy Value Labels into New Variables
To make sure I do everything correctly, I want to compare the new and old values after reversing some items. The best way to do so is using the SPSS Clone Variables Tool. However, plain syntax will do as well: I'll pass the value labels into some new string variables. If this doesn't make any sense yet: read on. It will in a minute.
*Declare new string variables for holding the value labels.
string s2 s3 s6 s10 (a25).
*Save value labels of v2, v3, v6 and v10 into new strings.
do repeat #new = s2 s3 s6 s10 / #old = v2 v3 v6 v10.
compute #new = valuelabel(#old).
end repeat.
execute.
3. Recode and Adjust Value Labels
I'll now fix the problem in 2 steps:
- I'll RECODE values 1, 2, 4 and 5 into 5, 4, 2 and 1. Values 3 (Neutral) and 6 (Don't know) are as desired so I'll skip those.
- I'll adjust the value labels of the recoded values accordingly.
recode v2 v3 v6 v10 (1 = 5)(2 = 4)(4 = 2)(5 = 1).
*Correct value labels for v2, v3, v6 and v10.
add value labels v2 v3 v6 v10 1 "Totally disagree" 2 "Somewhat disagree" 4 "Somewhat agree" 5 "Totally agree".
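For items like these, reversing a 1-to-5 scale comes down to subtracting each value from 6 (the lowest plus the highest valid value). As a rough alternative sketch, not the method used in this tutorial, the RECODE above could be swapped for the arithmetic below. Run one or the other, never both, since reversing twice restores the original coding. The stand-in name #v is just an illustrative choice.

```spss
*Alternative to the RECODE above: reverse values 1 through 5 and leave 6 untouched.
do repeat #v = v2 v3 v6 v10.
if (range(#v,1,5)) #v = 6 - #v.
end repeat.
execute.
```

Either way, the value labels still need to be corrected with ADD VALUE LABELS.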
4. Check Results
I'm basically done. However, I want to make sure my results are correct and I can do so with some simple CROSSTABS as shown below. If you reversed a large number of variables, you could loop over these commands with Python but it's not worth the effort for just 4 variables.
crosstabs v6 by s6.
All cases who responded “Neutral” on the original variable (copied into s6) still respond “Neutral” on the reversed variable v6. This holds for all values of all reversed variables. The only things we changed are the numeric values underlying these answers, exactly as planned.
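The same comparison for the other three reversed variables, using the string copies created earlier, would look something like this:

```spss
*Compare reversed values with original value labels for the remaining items.
crosstabs v2 by s2.
crosstabs v3 by s3.
crosstabs v10 by s10.
```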
By recoding into the same variables, we retained all dictionary information such as variable labels, formats and user missing values (which still need to be set for these data). On top of that, all variables still have their original names and order.
5. Clean Things Up
Now that we're done, our new string variables are redundant so let's remove them. We'll also set 6 (“Don't know”) as a user missing value.
delete variables s2 to s10.
*Set 6 as user missing value for questionnaire items.
missing values v1 to v10 (6).
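With 6 excluded as a user missing value, means over the items become meaningful. As a quick, optional check (not part of the original steps), something like the DESCRIPTIVES command below should show minima and maxima within 1 through 5, assuming the items hold no values other than 1 through 6:

```spss
*Inspect means, minima and maxima after reversing and setting missing values.
descriptives v1 to v10.
```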
Thanks for reading!
THIS TUTORIAL HAS 5 COMMENTS:
By Kevin Garcia on April 13th, 2018
Outstanding piece of work!
Especially the trick with the valuelabel command is a brilliant way to copy variables.
Keep writing more!
By Ruben Geert van den Berg on April 13th, 2018
Hi Kevin, thanks for the compliments! I'll come up with more stuff over the next weeks insofar as I can find the time.
P.s. VALUELABEL is not a command but a function in SPSS.
By Jagdish on April 29th, 2020
I am a beginner in statistics. Please advise me on two Likert scale questions with unequal numbers of options: the first question has 5 options (1 = Highest to 5 = Lowest), whereas the second question has only three options (1 = Citing a reference is very useful, 2 = Not sure and 3 = Citing a reference is not meaningful). How do I reverse code these in SPSS or JASP, step by step? Thanking you in anticipation.
By Chris on January 7th, 2022
2 silly but very urgent questions about reverse scoring…
1) Where do I put my data (the answers to the questionnaire) in the data view? Under the original variables or the reversed ones?
For example, suppose we have a Likert scale questionnaire from 1 to 5. We reverse score certain questions (e.g. “I don’t have a lot of friends” is a reversed question for the “Extraversion” factor of personality). Let’s say that the participant answers 4 (if we reverse it, the new value is 2). Where do I enter the value in the data, and which value: under the original question (value 4) or under the reversed one (value 2)?
2) Do I have to enter values for the reversed questions as well, as new variables (given that they’ve already been entered prior to reversing)?
By Ruben Geert van den Berg on January 9th, 2022
The expert advice on this is that you always reverse code your original variables (into themselves) instead of creating any new variables.
So if you've a variable v01 with 1 = Very good ..., then reverse code it into the same variable v01 with 1 = Very bad ...
If you're worried about data integrity, first clone all variables that you'll reverse code as backup copies. Then reverse code the original ones (not the cloned ones).
We've a nice tool for cloning variables under SPSS Clone Variables Tool.
Hope that helps!