Introduction & Practice Data File
This tutorial shows how to compute means over both variables and cases in a simple but solid way. We encourage you to follow along by downloading and opening restaurant.sav, part of which is shown below.
Quick Data Check
Before computing anything whatsoever, we always need to know what's in our data in the first place. Skipping this step often leads to wrong results, as we'll see in a minute. Let's first inspect some frequencies by running the syntax below.
set tnumbers both.
*Quick data check.
frequencies v1 to v5.
Result
Right, now there are two things we need to ensure before proceeding. First, do all variables have similar coding schemes? For the food rating, higher numbers (4 or 5) reflect more positive attitudes (“Good” and “Very good”) but does this hold for all variables? If we take a quick peek at our 5 tables, we see that it does.
Second, do we need to set any user missing values? That is, are there data values that we want to exclude from our computations? In this case, there are: we need to exclude 6 (“No answer”) from all computations. We'll do so with the syntax below.
Setting Missing Values
missing values v1 to v5 (6).
*Check again.
frequencies v1 to v5.
Result
Computing Means over Variables
Right, the simplest way to compute means over variables is shown in the syntax below. Note that we can usually specify variable names separated by spaces but for some odd reason we need to use commas in this case.
compute happy1 = mean(v1, v2, v3, v4, v5).
execute.
If our target variables are adjacent in our data, we don't need to spell out all variable names. Instead, we'll enter only the first and last variable names (which can be copy-pasted from variable view into our syntax window) separated by TO.
compute happy2 = mean(v1 to v5).
execute.
Computing Means - Dealing with Missing Values
If we take a good look at our data, we see that some respondents have a lot of missing values on v1 to v5. By default, the mean over v1 to v5 is computed for any case that has at least one non-missing value on those variables. If all five values are (system or user) missing, a mean can't be computed so the result is a system missing value, as we see in our data.
It's quite common to exclude cases with many missing values from computations. In this case, the easiest option is using the dot operator. For example, mean.3(v1 to v5) means “compute means over v1 to v5 but only for cases having at least 3 non-missing values on those variables”. Let's try it.
Computing Means - Exclude Cases with Many Missings
compute happy3 = mean.3(v1 to v5).
execute.
Result
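To see the difference between MEAN and MEAN.3 on a handful of cases, here's a minimal sketch that uses a tiny made-up dataset rather than restaurant.sav. The values and the variable names mean_any, mean_3 and n_valid are purely hypothetical. Note that DATA LIST replaces the active dataset, so save your work before trying it.

```spss
* Hypothetical toy data with 3 cases (this is not restaurant.sav).
data list free / v1 v2 v3 v4 v5.
begin data
4 5 6 3 4
6 6 6 6 5
6 6 6 6 6
end data.

* Treat 6 as user missing, just like in the tutorial.
missing values v1 to v5 (6).

* Mean over whatever valid values a case has.
compute mean_any = mean(v1 to v5).

* Mean only for cases with at least 3 valid values.
compute mean_3 = mean.3(v1 to v5).

* Number of valid values per case, for reference.
compute n_valid = nvalid(v1 to v5).

* LIST reads the data, so the pending computations are run here.
list.
```

The first case has four valid values (4, 5, 3 and 4) so both means are 4. The second case has only one valid value (5), so mean_any is 5 but mean_3 is system missing. The third case has no valid values at all, so both means are system missing.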
A more general way that'll work for more complex computations as well is by using IF as shown below.
if (nvalid (v1 to v5) >= 3) happy4 = mean(v1 to v5).
execute.
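When several new variables should follow the same rule, one option is wrapping them in DO IF. The lines below are only a sketch under that assumption; happy5 and total5 are hypothetical variable names that aren't part of the original example.

```spss
* Only cases with at least 3 valid values get values on the new variables.
do if (nvalid(v1 to v5) >= 3).
* Mean over the valid values.
compute happy5 = mean(v1 to v5).
* Sum over the valid values.
compute total5 = sum(v1 to v5).
end if.
execute.
```

Cases that don't satisfy the condition are skipped, so they end up with system missing values on both new variables.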
SPSS - Compute Means over Cases
So far we computed horizontal means: means over variables for each case separately. Let's now compute vertical means: means over cases for each variable separately. We'll first create output tables with means and we'll then add such means to our data.
Means over all cases are easily obtained with DESCRIPTIVES as in
descriptives v1 v2.
Result
Means for Groups Separately
So what if we want means for male and female respondents separately? One option is SPLIT FILE but this is way more work than necessary. A simple MEANS command will do as shown below.
set tnumbers labels.
*Report means for genders separately.
means v1 v2 by gender /cells mean.
Result
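For comparison, the SPLIT FILE route mentioned above could look roughly like the sketch below. It assumes the same variables (v1, v2 and gender) and shows why it takes more steps: the cases must be sorted first and SPLIT FILE must be switched off afterwards.

```spss
* SPLIT FILE requires the cases to be sorted by the split variable.
sort cases by gender.
split file layered by gender.

* This table now reports means for each gender separately.
descriptives v1 v2.

* Switch SPLIT FILE off or all later commands will be split by gender too.
split file off.
```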
SPSS - Add Means to Dataset
Finally, you may sometimes want means over cases as new variables in your data. The way to go here is AGGREGATE as shown below.
aggregate outfile * mode addvariables
/mean_1 = mean(v1).
If you'd like means for groups of cases separately, add one or more BREAK variables as shown below. This example also shows how to add means for multiple variables in one go, again by using TO.
aggregate outfile * mode addvariables
/break gender
/mean_2 to mean_5 = mean(v2 to v5).
Result
Note that we already saw these means (over v2, for genders separately) in our output after running
means v2 by gender.
Right. That's about all we could think of regarding means in SPSS. If you've any questions or remarks, please feel free to throw in a comment below.
THIS TUTORIAL HAS 11 COMMENTS:
By Ruben Geert van den Berg on October 29th, 2018
Use MEAN() for means and SD() for standard deviations. Don't use the + operator. Also see SPSS Missing Values Tutorial for the reasons for this.
HTH!
By William Peck on November 5th, 2018
I got an error with the last two exercises on the Aggregate commands, ">Error # 10910. Command name: aggregate. The definition of a new variable on the AGGREGATE command specifies a new variable name that duplicates the name of an existing variable. To replace existing variables, specify OVERWRITE = YES in the OUTFILE subcommand."
So I added this to the end of the command and it worked great! :-)
aggregate outfile * mode addvariables overwrite=YES
/mean_1 = mean(v1).
By Muhammad Abu-salih on March 7th, 2019
I need to try this tutorial to find its usefulness, but computation of means through the usual commands of SPSS is very easy.
By Eugenia ML on June 12th, 2019
You are so clear and precise! It's a real help!
Thank you so much!
By Morthen on May 31st, 2022
Hi!
I have used this command for means of sets with missing points:
Mean4 (v1,v2,v3,v4,v5,v6,v7,v8)
to get a mean of whichever cases have more than 4 valid values.
As you can see, I am missing a "." between "mean" and "4".
I got slightly different mean values with this command, and now I may have to redo an entire analysis. Do you know what difference it makes?