# SPSS tutorials


# SPSS IF – A Quick Tutorial

In SPSS, IF is a conditional COMPUTE command. It calculates a (possibly new) variable but only for those cases that satisfy some condition(s). This tutorial walks you through some typical examples of the IF command.

## Example 1 - Replace Missing Values

With the syntax below we'll first create some test data. Next we'll set the existing variable score to 100 for all respondents (only one in this case) having a missing value on score. An alternative here is RECODE score (missing = 100). The effect only becomes visible after sorting the cases. This is because IF is technically a transformation: SPSS leaves it pending until a command that reads the data, such as SORT CASES or EXECUTE, is run.

## SPSS IF Syntax Example 1

*1. Create test data.

data list free/gender score.
begin data
0 80 1 85 0 90 1 95 0 '' 1 105 0 110 1 115
end data.

*2. Replace missing value with 100.

if missing(score) score = 100.

*3. Sort cases.

sort cases gender.
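
The RECODE alternative mentioned above could look like the sketch below. It's meant to replace step 2 rather than follow it, and the EXECUTE is only there to run the pending transformation right away.

```spss
*Alternative to step 2: replace missing values with RECODE.

recode score (missing = 100).
execute.
```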

## Example 2 - Score Groups

Next, we'll create score groups. Respondents scoring under 100 points get a 1 (‘low score’). The others get a 2 (‘high score’). We'll demonstrate three ways to do so. The third may seem a little weird. It's explained in Compute A = B = C.

## SPSS IF Syntax Example 2

*1. Create score groups option 1.

if score lt 100 group_a = 1.
if score ge 100 group_a = 2.
exe.

*2. Create score groups option 2.

recode score (100 thru hi = 2) (else = 1) into group_b.
exe.

*3. Create score groups option 3.

compute group_c = (score ge 100) + 1.
exe.
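
A minimal sketch of why option 3 works: SPSS evaluates a condition such as (score ge 100) to 1 if it's true and 0 if it's false. Splitting the COMPUTE into two steps makes this visible. The variable names is_high and group_c2 are made up for this illustration.

```spss
*Step 1: the condition by itself yields 0 (false) or 1 (true).

compute is_high = (score ge 100).

*Step 2: adding 1 turns 0/1 into group codes 1/2.

compute group_c2 = is_high + 1.
exe.
```

After running this, group_c2 should hold the same values as group_c.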

## Example 3 - Gender-Score Groups

Now we'll create score groups for female and male respondents separately. At this point we can't use a simple RECODE anymore. This is because the conditions now involve two variables, gender and score. A simple approach here is using four IF statements. Each holds two conditions (gender and score). A faster but more difficult equivalent here is a single COMPUTE command.

## SPSS IF Syntax Example 3

*1. Gender-score groups option 1.

if score lt 100 and gender eq 0 group_d = 1.
if score ge 100 and gender eq 0 group_d = 2.
if score lt 100 and gender eq 1 group_d = 3.
if score ge 100 and gender eq 1 group_d = 4.
exe.

*2. Gender-score groups option 2.

compute group_e = 2 * gender + (score ge 100) + 1.
exe.
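
The arithmetic behind option 2 can be sketched as follows, assuming gender is coded 0/1 as in the test data: 2 * gender contributes 0 or 2, the condition (score ge 100) contributes 0 or 1, and the final + 1 shifts the result to 1 through 4. The CROSSTABS check at the end is an addition for illustration only.

```spss
*How 2 * gender + (score ge 100) + 1 works out.
*gender 0, score lt 100: 0 + 0 + 1 = 1.
*gender 0, score ge 100: 0 + 1 + 1 = 2.
*gender 1, score lt 100: 2 + 0 + 1 = 3.
*gender 1, score ge 100: 2 + 1 + 1 = 4.

*Check: all cases should end up on the diagonal of this table.

crosstabs group_d by group_e.
```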

## Difference Between IF and DO IF

Very similar to the IF commands we showed is DO IF-ELSE IF-END IF. Apart from the latter usually requiring more syntax, there's an important difference between the two. It shows up when conditions are not mutually exclusive, that is, when a single case may satisfy two or more conditions simultaneously. In that case, the following happens:

• With IF the last condition that holds prevails. Since IF statements are completely separate commands, later ones simply overwrite the results of previous ones.
• With DO IF-ELSE IF-END IF the first condition that holds prevails. The trick is in ELSE IF. The “ELSE” here means “only if the preceding condition(s) don't hold...”.

The final syntax example demonstrates this difference between IF and DO IF-ELSE IF-END IF.

## SPSS IF Syntax Example 4

*1. Three score groups with DO-IF.

compute group_f = 1.
do if score ge 100.
compute group_f = 3.
else if score ge 90.
compute group_f = 2.
end if.

*2. Sort cases.

sort cases score.

*3. Seemingly equivalent IF statements don't work: the second IF overwrites the first.

compute group_g = 1.
if score ge 100 group_g = 3.
if score ge 90 group_g = 2.
exe.
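
Since the last IF that holds prevails, one way to make IF reproduce the DO IF result is to order the conditions from least to most restrictive, so that the highest applicable group is assigned last. A minimal sketch, with group_h as a made-up variable name:

```spss
*IF statements reordered so the highest applicable group is assigned last.

compute group_h = 1.
if score ge 90 group_h = 2.
if score ge 100 group_h = 3.
exe.

*Check against the DO IF result.

crosstabs group_f by group_h.
```

Making the conditions mutually exclusive, for instance `if score ge 90 and score lt 100 group_h = 2.`, would give the same result regardless of the order of the IF statements.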


# This tutorial has 36 comments

• ### By Ruben Geert van den Berg on January 23rd, 2017

Hi Toria!

You can't recode 2 variables into one. Your easiest option may be IF with multiple conditions as in

```
compute outcomevariable = 0.
if(gender = 1 and score >= 12) outcomevariable = 1.
if(gender = 2 and score >= 14) outcomevariable = 1.
execute.
```

If one condition occurs in many such statements, you can write cleaner (thus better) syntax with DO IF as in

```
do if (gender = 0).
recode score (0 thru 11 = 0)(12 thru 25 = 1) into outcomevariable.
*possibly more statements here for (gender = 0)....
else if (gender = 1).
recode score (0 thru 14 = 0)(15 thru 25 = 1) into outcomevariable.
*possibly more statements here for (gender = 1)....
end if.
```

Hope that helps!

• ### By Toria Vi on January 23rd, 2017

Hi Ruben, I'm a phd student & I find your website very informative, easy to read and succinct. Here's my question:
I need to create a dichotomous variable yes/no based on whether cases meet a particular cut-off score on a subscale. Now, that is simple enough with a recode into a different variable. The problem I'm having is the cut-off score varies based on the cases' gender/age. I.E: if the case is a boy aged =12, or if case is girl =14 etc. All recoded into the same variable. Thank you.

• ### By Ruben Geert van den Berg on December 16th, 2016

Hi Alberto! This sounds like a serious challenge. Could you perhaps send me a sample of these data by email? I'm off today but perhaps I can try and write the desired syntax on Saturday/Sunday.

• ### By ALBERTO ZUCCHI on December 14th, 2016

Hello again, Ruben!
I have an unusual problem (for what is my knowledge, at least), and I'm sure you can give me substantial help.
I have a sequence of data, concerning patients undergoing rehabilitation procedures, like this:
ID Var1 Data1 Data2 .... Flag
where ID is the unique ID for each patient. I have 1 to n possible IDs as cases (real facts, 1 to 21 IDs).
Var1 is defining the hospital kind of take-in-charge (ie, non rehab or rehab).
Data1 is day of hospital take-in charge, and data2 is day of hospital discharge.
I need to know, for each ID's rehab take-in charge, if there is an antecedent non-rehab take in charge comprised in a time span of 7 days, and so flag it with a dummy variable (flag).
By example:
ID Var1 data1 data2
1 non rehab 02/15/2016 02/20/2016
1 rehab 02/23/2016 02/26/2016
1 rehab 04/05/2016 04/10/2016
In this case, the second record of patient n. 1 must be flagged as, by example, 1 (ie there are less than 7 days between the discharge date of first antecedent non rehab take-in charge of a rehab take-in charge).
Of course, first record of ID1 will have a flag equal to 0, and so the 3rd take-in charge (being anteceded by another rehab take-in charge).
I've tried to use restructurate command, to perform on single Ids put on single row all operations, but problem is that I have IDs that have till 21 take in charges,so the new dataset is multiplying variables in a too wide excess. Any suggestion, Ruben?

Many thanks!

Alberto Zucchi

• ### By Ruben Geert van den Berg on December 5th, 2016

Hi Laura! The right way to do this is to set 0 as a user missing value with something like `missing values v1 to v4 (0).` Then use COMPUTE with the MEAN function: `compute somemean = mean(v1 to v4).`

Try and run the syntax below step by step. It creates a handful of test cases. Means are computed only over nonzero values as you can easily see. Hope that helps!

*1. Create empty cases.

data list free/id.
begin data
1 2 3 4 5 6 7 8 9 10
end data.

*2. Create 4 test variables holding values from 0 through 3.

set seed 1.

do repeat #new = v1 to v4.
compute #new = rv.binom(3,0.5).
end repeat.
execute.

*3. Set zero as missing value.

missing values v1 to v4 (0).

*4. Compute mean over only nonzero values.

compute mymean = mean(v1 to v4).
execute.