SPSS transformations between DO IF ... and END IF are applied only to cases (rows of data) that satisfy one or more conditions. In many cases, IF is a faster way to accomplish the same results.
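To make the comparison with IF concrete, here's a minimal sketch. It assumes employees.sav is open and that gender is coded 0 for female respondents, as in the examples in this tutorial. The variable female is hypothetical and only serves as an illustration.

```spss
*DO IF version: three lines for one conditional COMPUTE.
do if gender = 0.
compute female = 1.
end if.

*IF version: the same result in a single line.
if (gender = 0) female = 1.
```

In both versions, cases that don't satisfy the condition are left untouched, so female stays system missing for them. DO IF mainly pays off when several transformations share one condition or when the target command is something other than COMPUTE, such as RECODE.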
SPSS Do If Example
Say we'd like to convert people's monthly income into income classes. We may want to use different cutoff points for male and female respondents. In this case, we can first run a RECODE command only for female respondents. Next, we'll run a different RECODE command for male respondents. The syntax below demonstrates this, using employees.sav.
SPSS Do If Syntax Example 1
*1. Set default directory and open data.
cd 'd:\downloaded'. /*Or wherever "employees.sav" is located.
get file 'employees.sav'.
*2. Recode command restricted to female respondents.
do if gender = 0.
recode monthly_income (lo thru 2000 = 1)(lo thru 2400 = 2)(lo thru hi = 3) into income_class.
end if.
*3. Recode command restricted to male respondents.
do if gender = 1.
recode monthly_income (lo thru 2500 = 1)(lo thru 3000 = 2)(lo thru hi = 3) into income_class.
end if.
*4. Inspect result.
crosstabs income_class by gender.
Else If
Although the previous syntax does its job, there's a shorter way to accomplish the exact same result: ELSE IF. The commands that follow ELSE IF are carried out only for cases that 1) satisfy the current condition(s) and 2) don't satisfy any of the previous conditions. The syntax below shows how to use it.
SPSS Do If Syntax Example 2
do if gender = 0.
recode monthly_income (lo thru 2000 = 1)(lo thru 2400 = 2)(lo thru hi = 3) into income_class.
else if gender = 1.
recode monthly_income (lo thru 2500 = 1)(lo thru 3000 = 2)(lo thru hi = 3) into income_class.
end if.
Else
Similarly to ELSE IF, commands that follow ELSE are carried out for all cases that don't satisfy any of the previous conditions. An important thing to notice here is that ELSE does not include cases for which a previous condition could not be evaluated due to missing values.
The final syntax example demonstrates this by creating a birth decennium variable using XDATE.YEAR. Next, FREQUENCIES confirms that respondents whose date of birth is unknown are not assigned to any birth decennium.
SPSS Do If Syntax Example 3
*1. Compute birth decennium from year of birth.
do if xdate.year(date_of_birth) lt 1960.
compute birth_decennium = 1.
else if xdate.year(date_of_birth) lt 1970.
compute birth_decennium = 2.
else if xdate.year(date_of_birth) lt 1980.
compute birth_decennium = 3.
else.
compute birth_decennium = 4.
end if.
*2. Apply value labels to birth_decennium.
value labels birth_decennium 1 '50''s' 2 '60''s' 3 '70''s' 4 '80''s'.
*3. Inspect frequency distribution for birth_decennium.
frequencies birth_decennium.
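If respondents with an unknown date of birth should end up in a category of their own, one option is to test for missing values first. The sketch below is a variation on the example above, not part of the original syntax: the code 9 and its label 'Unknown' are assumptions. The MISSING check has to be the first condition, because a DO IF or ELSE IF condition that evaluates to missing makes SPSS skip the rest of the structure for that case.

```spss
*Sketch: handle unknown dates of birth explicitly.
*The code 9 and its label are assumptions made for this illustration.
do if missing(date_of_birth).
compute birth_decennium = 9.
else if xdate.year(date_of_birth) lt 1960.
compute birth_decennium = 1.
else if xdate.year(date_of_birth) lt 1970.
compute birth_decennium = 2.
else if xdate.year(date_of_birth) lt 1980.
compute birth_decennium = 3.
else.
compute birth_decennium = 4.
end if.
add value labels birth_decennium 9 'Unknown'.
frequencies birth_decennium.
```

Since MISSING always returns true or false, the first condition can be evaluated for every case, and the later conditions are only reached by cases with a valid date of birth.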
Note
Only SPSS transformation commands can be used within DO IF. This excludes most commands that generate output, such as FREQUENCIES and DESCRIPTIVES. To run such commands on subsets of cases, see FILTER, SPLIT FILE and SELECT IF.
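As a rough illustration of running an output command on a subset of cases, the sketch below shows two common options. It assumes income_class was created by one of the earlier examples and that gender is coded 0 for female respondents. The variable filter_female is a hypothetical name.

```spss
*Option 1: TEMPORARY makes SELECT IF apply to the next procedure only.
temporary.
select if gender = 0.
frequencies income_class.

*Option 2: FILTER excludes cases from procedures until it is switched off.
compute filter_female = (gender = 0).
filter by filter_female.
frequencies income_class.
filter off.
```

Note that SELECT IF without TEMPORARY permanently removes the unselected cases from the active dataset, so the filter approach is the safer choice when in doubt.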
THIS TUTORIAL HAS 16 COMMENTS:
By Janice on November 3rd, 2015
I can't download the employees.sav
By Ruben Geert van den Berg on November 3rd, 2015
Thanks for your comment! I'm kinda surprised, I just tried the link in this tutorial and it worked just fine. I'll send you the file by email straight away. Hope that helps!
By Simon Mbai on May 28th, 2018
Quite informative. Can I get around 20 of the most used commands in SPSS?
Regards
By Niels on January 23rd, 2019
Hi Ruben,
I have a question. You mention "In many cases, IF is a faster way to accomplish the same results." Do you say this because the syntax is shorter, or is the IF command also computationally faster?
In particular, I'm picturing a situation where you want to compute at least 2 different values (each with their own condition). In this case two IF commands would be needed, and two passes through all the rows, whereas one DO IF, ELSE IF would suffice (with just one pass through the rows?).
Is my thinking here correct, assuming the DO IF is faster in such cases, or is the IF command optimized somehow to still be faster? Would be great to hear your thoughts on this! Thanks in advance!
By Ruben Geert van den Berg on January 24th, 2019
Hi Niels, nice to hear from you!
I meant that IF is faster because it's just one line as opposed to at least 3 lines as in
DO IF...
COMPUTE/RECODE/COUNT...
END IF.
Computationally, it won't make any difference. Since IF is a transformation command, multiple IF commands without anything between them require only 1 data pass. Just don't add a redundant EXECUTE after each IF statement; that's a really bad idea!
I should add that data passes only take long if you've a huge number of cases in your active dataset. For smaller data, redundant data passes don't make a noticeable difference.
Hope that helps!