Most real-world data contain some (or many) missing values. It's always a good idea to inspect the amount of missingness to avoid unpleasant surprises later on. To do so, SPSS offers some missing values functions that are mostly used with COMPUTE, IF and DO IF. This tutorial demonstrates how to use them effectively, using the last 5 variables in hospital.sav.

## Setting User Missing Values

Before discussing SPSS missing values functions, we'll first set 6 as a user missing value for the last 5 variables by running the line of syntax below.

missing values doctor_rating to facilities_rating (6).

## SPSS Missing Values Functions

Expression | Meaning | Returns |
---|---|---|
MISSING | Evaluate whether value is system missing or user missing | True or false |
SYSMIS | Evaluate whether value is system missing | True or false |
NMISS | Return number of missing values over variables | Numeric value |
NVALID | Return number of valid values over variables | Numeric value |

## SPSS MISSING Function

**SPSS MISSING function evaluates whether a value is missing** (either a user missing value or a system missing value). For example, we'll flag cases that have a missing value on doctor_rating with the syntax below. If the COMPUTE command puzzles you, see Compute A = B = C for an explanation.

***1. Flag cases having a missing value on doctor_rating.**

compute mis_1 = missing(doctor_rating).

***2. Move flagged cases to top of file.**

sort cases mis_1 (d).
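MISSING is also useful as the first condition in DO IF. If a DO IF or ELSE IF condition itself evaluates to missing, SPSS skips the rest of the structure for that case, so testing for missingness first keeps such cases under control. The sketch below illustrates this; the variable name rating_cat and the cutoff of 4 are hypothetical choices, not part of the original tutorial.

```spss
* Hypothetical example: classify cases, handling missing values first.
do if (missing(doctor_rating)).
compute rating_cat = 0.
else if (doctor_rating >= 4).
compute rating_cat = 2.
else.
compute rating_cat = 1.
end if.
execute.
```

Here rating_cat is 0 for cases with a (user or system) missing doctor_rating, 2 for ratings of 4 or higher and 1 for all other valid ratings.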

## SPSS SYSMIS Function

**SPSS SYSMIS function evaluates whether a value is system missing**. For example, the syntax below uses IF to replace all system missing values with 99. We'll then label this value, specify it as user missing and run a quick check with FREQUENCIES.

***1. Change system missing values to 99.**

if sysmis(doctor_rating) doctor_rating = 99.

***2. Add value label 99.**

add value labels doctor_rating 99 'Recoded system missing value'.

***3. Specify 6 and 99 as user missings.**

missing values doctor_rating (6,99).

***4. Quick check.**

frequencies doctor_rating.
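As a possible alternative to step 1, RECODE accepts SYSMIS as an input keyword, so the same replacement can be written without IF. This is a minimal sketch of that variant; steps 2 through 4 would still apply afterwards.

```spss
* Alternative sketch: replace system missing values with 99 using RECODE.
recode doctor_rating (sysmis = 99).
execute.
```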

## SPSS NMISS Function

**SPSS NMISS function counts missing values within cases over variables**. Cases with many missing values may be suspicious and you may want to exclude them from analysis with FILTER or SELECT IF. The syntax below runs a quick scan for such cases.

***1. Compute variable indicating missings per case.**

compute mis_2 = nmiss(doctor_rating to facilities_rating).

***2. Apply variable label. Tip: indicate number of variables involved here.**

variable labels mis_2 'Number of missing values over doctor_rating to facilities_rating (5 variables)'.

***3. Quick check.**

frequencies mis_2.
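Once the frequency table suggests a sensible cutoff, FILTER can exclude cases with too many missing values without deleting them. The sketch below assumes a cutoff of at most 2 missing values out of 5; both the cutoff and the variable name filt_1 are illustrative assumptions.

```spss
* Assumed cutoff: keep cases with at most 2 missing values out of 5.
compute filt_1 = (nmiss(doctor_rating to facilities_rating) <= 2).
filter by filt_1.
* Analyses run here use only the cases where filt_1 equals 1.
frequencies doctor_rating.
* Switch the filter off again to restore all cases.
filter off.
```

Unlike SELECT IF, FILTER leaves all cases in the data, so the exclusion can be undone with FILTER OFF.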

## SPSS NVALID Function

**SPSS NVALID function counts the number of valid values over variables**. It is equivalent to the number of variables minus NMISS over those variables. Note that the dot operator (as in MEAN.3) is a shorter alternative for requiring a minimum number of valid values in statistical functions such as MEAN and SUM: cases with fewer valid values get a system missing result.

***Count valid values over doctor_rating to facilities_rating (5 variables).**

compute valid_1 = nvalid(doctor_rating to facilities_rating).

execute.
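To illustrate the dot operator, the sketch below computes a mean over the 5 ratings only for cases with at least 3 valid values, first with MEAN.3 and then with an explicit NVALID condition. The threshold of 3 and the names mean_1 and mean_2 are assumptions chosen for illustration.

```spss
* Mean over the 5 ratings, only for cases with at least 3 valid values.
compute mean_1 = mean.3(doctor_rating to facilities_rating).
* Equivalent sketch using NVALID explicitly.
if (nvalid(doctor_rating to facilities_rating) >= 3) mean_2 = mean(doctor_rating to facilities_rating).
execute.
```

Both variables should hold identical values, with system missing values for cases that have fewer than 3 valid ratings.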

## THIS TUTORIAL HAS 7 COMMENTS:

## By rudy g. perez on December 3rd, 2014

very good presentations.

## By Dan on November 1st, 2017

Very informative on missing values in SPSS, thank you.

## By Carrie Petrucci on April 29th, 2018

Clearest description of how to handle missings and non-missings in SPSS that I've come across. Thanks so much.

## By Nick on November 25th, 2019

Thank you for this info. How would I go about deleting cases with only a certain percentage of missing data? For instance, I have a dataset with 50 items (totaling 8 variables). I don’t want to keep participants with less than 50% of missing data.

## By Ruben Geert van den Berg on November 25th, 2019

Hi Nick!

Let's say you've 8 variables, v01 to v08. You can delete cases with 4 or more missings in one go:

SELECT IF(NMISS(v01 to v08) < 4).

EXECUTE.

which means "delete all cases except those with fewer than 4 missing values on v01 to v08". But a safer option is to first create a helper variable and inspect it:

COMPUTE MIS_1 = NMISS(v01 to v08).

VARIABLE LABELS MIS_1 "Number of missing values over v01 to v08".

FREQUENCIES MIS_1.

And then delete unneeded cases with

SELECT IF(MIS_1 < 4).

EXECUTE.

The frequency table tells you how many cases you'll have left. So you'll make a better informed decision than when "blindly" deleting an unknown number of cases with 4 or more missings.

Hope that helps!