
# SPSS SELECT IF – Tutorial & Examples

In SPSS, SELECT IF permanently removes a selection of cases (rows) from your data.

## Summary

SELECT IF in SPSS basically means “delete all cases that don't satisfy one or more conditions”. For example, select if(gender = 'female'). permanently deletes all cases whose gender is not female. Let's now walk through some real-world examples using bank_clean.sav, partly shown below.

## Example 1 - Selection for 1 Variable

Let's first delete all cases that don't have at least a Bachelor's degree. The syntax below:

• inspects the frequency distribution for education level;
• deletes unneeded cases;
• inspects the results.

```
*Show values and value labels in new output tables.
set tnumbers both.

*Run minimal frequencies table.
frequencies educ.

*Select cases with a Bachelor's degree or higher. Delete all other cases.
select if(educ >= 4).

*Reinspect frequencies.
frequencies educ.
```

## Result

As we see, our data now only contain cases having a Bachelor's, Master's or PhD degree. Importantly, cases having missing values on education level have been removed from the data as well.
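If cases with a missing education level should stay in the data, the condition can be extended with the MISSING function. The lines below are a minimal sketch, assuming educ is a numeric variable; they would be run instead of the SELECT IF command above, not after it.

```spss
*Keep cases with a Bachelor's degree or higher plus cases with a missing education level.
select if(educ >= 4 or missing(educ)).

*Reinspect frequencies.
frequencies educ.
```

Whether keeping such cases makes sense depends on the analysis that follows, so this is an option rather than a recommendation.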

## Example 2 - Selection for 2 Variables

The syntax below selects cases based on gender and education level: we'll keep only female respondents having at least a Bachelor's degree in our data.

```
*Inspect contingency table sex and education.
crosstabs educ by gender.

*Select females having a Bachelor's degree or higher.
select if(gender = 0 & educ >= 4).

*Reinspect contingency table.
crosstabs educ by gender.
```
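The & operator can also be written as AND. Either way, a case is kept only if both conditions are true. The sketch below illustrates this on a tiny made-up dataset (hypothetical ids and codes, not bank_clean.sav). Note that running it replaces the active dataset, so it's best tried in a separate session.

```spss
*Create 4 hypothetical cases: gender 0 = female, educ 4 or higher = Bachelor's or higher.
data list free/id gender educ.
begin data
1 0 5
2 1 5
3 0 2
4 0 4
end data.

*Keep cases satisfying both conditions.
select if(gender = 0 and educ >= 4).

*Only cases 1 and 4 are left.
list.
```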

## Example 3 - Selection for (Non) Missing Values

Selections based on (non) missing values are straightforward if you master SPSS Missing Values Functions. For example, the syntax below shows 2 options for deleting cases having fewer than 7 valid values on the last 10 variables (overall to q9).

```
*Select cases having at least 7 non missing values out of last 10 questions.
select if(nvalid(overall to q9) >= 7)./*At least 7 valid values or at most 3 missings.
execute.

*Alternative way, exact same result.
select if(nmiss(overall to q9) < 4)./*Fewer than 4 missings or more than 6 valid values.
execute.
```
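Before running such a selection, it can help to see how many cases fall below the cutoff. One possible approach, sketched here with a hypothetical helper variable named n_valid, is to store the number of valid values per case and inspect its distribution first.

```spss
*Store number of valid values over overall to q9 in a new (hypothetical) variable.
compute n_valid = nvalid(overall to q9).

*Inspect how many cases have fewer than 7 valid values.
frequencies n_valid.
```

Cases having n_valid below 7 are the ones that the SELECT IF commands above would delete.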

## Tip 1 - Inspect Selection Before Deletion

Before deleting cases, I sometimes want to have a quick look at them. A good way for doing so is creating a FILTER variable. The syntax below shows the right way for doing so.

```
*Create filter variable holding only zeroes.
compute filt_1 = 0.

*Set filter variable to 1 for cases we want to keep in data.
if(nvalid(overall to q9) >= 7) filt_1 = 1.

*Move unselected cases to bottom of dataset.
sort cases by filt_1 (d).

*Scroll to bottom of dataset now. Note that cases 459 - 464 will be deleted because they have 0 on filt_1.

*If selection as desired, delete other cases.
select if(filt_1).
execute.
```

Quick note: select if(filt_1). is a shorthand for select if(filt_1 <> 0). and deletes cases having either a zero or a missing value on filt_1.

## Result

Cases that will be deleted are at the bottom of our data. We also readily see we'll have 458 cases left after doing so.
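A filter variable can also be used with the FILTER command, which excludes cases from procedures without deleting them. The lines below are a minimal sketch, assuming filt_1 was created as shown above and the final SELECT IF has not been run yet.

```spss
*Exclude cases having 0 (or a missing value) on filt_1 from procedures.
filter by filt_1.

*Procedures now use only cases having filt_1 = 1.
frequencies educ.

*Switch filter off so all cases are used again.
filter off.
```

This leaves all cases in the data, which makes it easy to compare results with and without the selection before deleting anything.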

## Tip 2 - Use TEMPORARY

A final tip I want to mention is combining SELECT IF with TEMPORARY. By doing so, SELECT IF only applies to the first procedure that follows it. For a quick example, compare the results of the first and second FREQUENCIES commands below.

```
*Make sure case deletion only applies to first procedure.
temporary.

*Select only female cases.
select if(gender = 0).

*Any procedure now uses only female cases. This also reverses case selection.
frequencies gender educ.

*Rerunning frequencies now uses all cases in data again.
frequencies gender educ.
```
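Because the selection is undone after one procedure, TEMPORARY makes it possible to analyze several subgroups in a row without reloading the data. The sketch below assumes that gender = 1 codes male respondents, which the examples above don't state explicitly.

```spss
*Frequencies for females only.
temporary.
select if(gender = 0).
frequencies educ.

*Frequencies for males only (assumed code 1). All cases are still in the data at this point.
temporary.
select if(gender = 1).
frequencies educ.
```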

## Final Notes

First off, parentheses around conditions in syntax are not required. Therefore, select if(gender = 0). can also be written as select if gender = 0. I used to think that shorter syntax is always better but I changed my mind over the years. Readability and clear structure are important too. I therefore use (and recommend) parentheses around conditions. This also goes for IF and DO IF.
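For illustration, this is roughly what parentheses around conditions look like with IF and DO IF. The variables flag_female and flag_degree are hypothetical names made up for this sketch.

```spss
*IF: flag female respondents.
compute flag_female = 0.
if(gender = 0) flag_female = 1.

*DO IF: flag respondents having a Bachelor's degree or higher.
do if(educ >= 4).
  compute flag_degree = 1.
else.
  compute flag_degree = 0.
end if.
execute.
```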

Right, I guess that should do. Did I miss anything? Please let me know by throwing a comment below.


# THIS TUTORIAL HAS 29 COMMENTS:

• ### By Vincent on May 13th, 2016

I noticed that in my syntax the code only works if there are no spaces in between, like:

SELECT IF End=1.

• ### By Ruben Geert van den Berg on May 13th, 2016

Hi Vincent!

What happens if you do throw in a bunch of spaces? It shouldn't make any difference as in:

```
data list free/id.
begin data
1 2 3 4 5
end data.

select if id       =        1.
execute.
```

good

• ### By Martin on September 17th, 2016

Hello Ruben,

I am a Statistics graduate. I think all your SPSS tutorials are very good, I like them a lot. Thank you !

Sincerely
Martin

• ### By Md. Uzzal Chowdhury on November 12th, 2016

need to know more SPSS