In SPSS, SELECT IF permanently removes a selection of cases (rows) from your data.
- Example 1 - Selection for 1 Variable
- Example 2 - Selection for 2 Variables
- Example 3 - Selection for (Non) Missing Values
- Tip 1 - Inspect Selection Before Deletion
- Tip 2 - Use TEMPORARY
Summary
SELECT IF in SPSS basically means “delete all cases that don't satisfy one or more conditions”. For example, select if(gender = 'female'). permanently deletes all cases whose gender is not female. Let's now walk through some real world examples using bank_clean.sav.

Example 1 - Selection for 1 Variable
Let's first delete all cases who don't have at least a Bachelor's degree. The syntax below:
- inspects the frequency distribution for education level;
- deletes unneeded cases;
- inspects the results.
*Show both values and value labels in output tables.
set tnumbers both.
*Run minimal frequencies table.
frequencies educ.
*Select cases with a Bachelor's degree or higher. Delete all other cases.
select if(educ >= 4).
*Reinspect frequencies.
frequencies educ.
Result

As we see, our data now only contain cases having a Bachelor's, Master's or PhD degree. Importantly, cases having missing values on education level have been removed from the data as well.
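Cases having a missing value on educ are deleted because educ >= 4 evaluates to missing rather than true for them, and SELECT IF only keeps cases for which the condition is true. If you'd rather keep such cases, a minimal sketch (to be run instead of the SELECT IF above, not after it) is to add the MISSING function to the condition:

```spss
*Keep cases with a Bachelor's degree or higher and cases with a missing education level.
select if(educ >= 4 or missing(educ)).
execute.
*Reinspect frequencies.
frequencies educ.
```

This works because a condition like "missing or true" evaluates to true in SPSS.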
Example 2 - Selection for 2 Variables
The syntax below selects cases based on gender and education level: we'll keep only female respondents having at least a Bachelor's degree in our data.
*Inspect contingency table before selection.
crosstabs educ by gender.
*Select females having a Bachelor's degree or higher.
select if(gender = 0 & educ >= 4).
*Reinspect contingency table.
crosstabs educ by gender.
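Conditions can also be combined with | (or) instead of & (and). The sketch below shows two alternatives to the selection above; run just one of them, because each SELECT IF deletes cases for good. Note that the codes gender = 1 (male) and educ = 5 (Master's degree) are my assumptions here, so check them against your own value labels first.

```spss
*Alternative 1: keep all females plus anyone having a Bachelor's degree or higher.
select if(gender = 0 | educ >= 4).
execute.

*Alternative 2: parentheses make the grouping explicit when mixing & and |.
*Assumes gender = 1 means male and educ = 5 means Master's degree.
select if((gender = 0 & educ >= 4) | (gender = 1 & educ >= 5)).
execute.
```

SPSS evaluates & before |, so the inner parentheses in the second alternative are not strictly required, but they do make the intention much easier to read.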
Example 3 - Selection for (Non) Missing Values
Selections based on (non) missing values are straightforward if you master SPSS Missing Values Functions. For example, the syntax below shows 2 options for deleting cases having fewer than 7 valid values on the last 10 variables (overall to q9).
select if(nvalid(overall to q9) >= 7)./*At least 7 valid values or at most 3 missings.
execute.
*Alternative way, exact same result.
select if(nmiss(overall to q9) < 4)./*Fewer than 4 missings or more than 6 valid values.
execute.
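For a selection on a single variable, the MISSING function may be all you need. As a minimal sketch, assuming we want to keep only cases having a valid value on overall:

```spss
*Keep only cases having a valid (non missing) value on overall.
select if(not missing(overall)).
execute.
```

Dropping "not" from the condition does the opposite: it keeps only the cases having a system or user missing value on overall.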
Tip 1 - Inspect Selection Before Deletion
Before deleting cases, I sometimes want to have a quick look at them. A good way for doing so is creating a FILTER variable. The syntax below shows the right way for doing so.
compute filt_1 = 0.
*Set filter variable to 1 for cases we want to keep in data.
if(nvalid(overall to q9) >= 7) filt_1 = 1.
*Move unselected cases to bottom of dataset.
sort cases by filt_1 (d).
*Scroll to bottom of dataset now. Note that cases 459 - 464 will be deleted because they have 0 on filt_1.
*If selection as desired, delete other cases.
select if(filt_1).
execute.
Quick note: select if(filt_1). is a shorthand for select if(filt_1 <> 0). and deletes cases having either a zero or a missing value on filt_1.
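If you'd like to run a procedure on the selection before deleting anything, you could also switch on filt_1 with the FILTER command. This is just a sketch of one option; it should be run after computing filt_1 but before the SELECT IF above.

```spss
*Switch on filt_1 as a filter: cases having 0 or a missing value are excluded but not deleted.
filter by filt_1.
*Inspect the selection with any procedure.
frequencies educ.
*Switch the filter off again. All cases are still in the data.
filter off.
```

Unlike SELECT IF, FILTER only excludes cases from procedures, so nothing is lost if the selection turns out to be wrong.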
Tip 2 - Use TEMPORARY
A final tip I want to mention is combining SELECT IF with TEMPORARY. By doing so, SELECT IF only applies to the first procedure that follows it. For a quick example, compare the results of the first and second FREQUENCIES commands below.
temporary.
*Select only female cases.
select if(gender = 0).
*The next procedure uses only female cases. Running it also reverses the case selection.
frequencies gender educ.
*Rerunning frequencies now uses all cases in data again.
frequencies gender educ.
Final Notes
First off, parentheses around conditions in syntax are not required. Therefore, select if(gender = 0). can also be written as select if gender = 0. I used to think that shorter syntax is always better but I changed my mind over the years. Readability and clear structure are important too. I therefore use (and recommend) parentheses around conditions. This also goes for IF and DO IF.
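For illustration, here's what parentheses around conditions look like for IF and DO IF. The variable names female and higher_educ are made up for this sketch and are not part of bank_clean.sav.

```spss
*IF with parentheses around the condition.
if(gender = 0) female = 1.
*DO IF with parentheses around the condition.
do if(educ >= 4).
compute higher_educ = 1.
else.
compute higher_educ = 0.
end if.
execute.
```

Note that cases having a missing value on educ are skipped by the entire DO IF structure, so they end up with a system missing value on higher_educ.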
Right, I guess that should do. Did I miss anything? Please let me know by throwing a comment below.
Thanks for reading!
THIS TUTORIAL HAS 29 COMMENTS:
By Kiran Acharya on August 16th, 2017
How to analyse frequency using weighted data
By Ruben Geert van den Berg on August 16th, 2017
Hi Kiran, what do you want to know?
Just switch on the correct frequency (weight) variable and proceed as usual. Right?
By Nik on May 18th, 2018
Hi! Thank you for this tutorial. I have a question. If I write a~=b, SPSS selects all cases (with a number) where a is different from b. My problem is that I need SPSS to select missing values as well. I mean, not just numbers but also blank cells. How can I do that? Thank you so much!
By Ruben Geert van den Berg on May 18th, 2018
Hi Nik!
No, that's not correct. a ~= b is true if a is not b. Simple as that. It's similar to a <> b or a ne b (ne is short for "not equal" but please avoid this operator).
For a numeric variable N1, try something like
SELECT IF (N1 ~= b & NOT MISSING(N1)).
Note that MISSING is a function that returns true for system or user missing values. Alternatively,
SELECT IF (not any(N1,b,$sysmis)).
deletes all cases having b or a system missing value on N1. Note that $sysmis is a system variable.
Neither method works for string variables because empty cells are not missing by default.
P.s. quick tip: first COMPUTE a selection variable and inspect it in data view. Only if it's correct, then run SELECT IF using the selection variable. Like so you won't delete any cases before you're sure you made the right selection.
Hope that helps!
By Renate on May 19th, 2018
Hello,
How can I combine two conditions?
SELECT IF
v_233 = 3
Feedb ~='-99'.
I want to keep all cases that have v_233= 3 and also those that have v_233 not equal to 3 but Feedb has some text.