A hospital wants to know how a homeopathic medicine for depression performs in comparison to alternatives. They administered 4 treatments to 100 patients (25 per treatment) for 2 weeks and then measured their depression levels. The data, part of which are shown above, are in depression.sav.

## Data Inspection - Split Histogram

Before running any statistical tests, let's first just take a look at our data. In this case, a split histogram basically tells the whole story in a single chart. We don't see many SPSS users run such charts but you'll see in a minute how incredibly useful it is. The screenshots below show how to create it.

As a final touch, you can add a nice title to your chart. We settled for “Distribution BDI per Medicine”.

## Syntax for Split Histogram

Clicking Paste results in the syntax below. Running it generates our chart.

```spss
*Run histogram of BDI scores for the four treatments separately.

GRAPH
  /HISTOGRAM=bdi
  /PANEL ROWVAR=medicine ROWOP=CROSS
  /TITLE='Distribution BDI per Medicine'.
```

## Result

- All **distributions look plausible**. We don't see very low or high BDI scores that should be set as missing values and the BDI scores even look reasonably normally distributed.
- The medicine **“None” results in the highest BDI scores**, indicating the worst depressive symptoms. “Pharmaceutical” results in the lowest levels of depressive illness and the other two treatments are in between.
- The four histograms are roughly equally wide, suggesting BDI scores have **roughly equal variances** over our four medicines.

## Means Table

We'll now take a more precise look at our data by running a means table with the syntax below.

```spss
*Run basic means table.

means bdi by medicine
  /cells count min max mean variance.
```

## Result

Unsurprisingly, our table mostly confirms what we already saw in our histogram. Note (under “N”) that each medicine has 25 observations so these two variables don't contain any missing values.

So can we conclude that “Pharmaceutical” performs best and “None” performs worst? Well, for our sample we can. For our population (all people suffering from depression) we can't.

The basic problem is that **samples differ from the populations from which they are drawn**. If our four medicines perform equally well in our population, then we may still see some differences between our sample means. However, *large* sample differences are unlikely if all medicines perform equally in our population. This basic reasoning is explained further in ANOVA - What Is It?.

The question we'll now answer is: **are the sample means different enough** to reject the hypothesis that the mean BDI scores in our population are equal?

## ANOVA Basics

We'll try to demonstrate that some medicines perform better than others by rejecting the **null hypothesis** that the mean BDI scores for our four medicines are all equal in our population. In short, our ANOVA tests whether all 4 means are equal. If they aren't, then we'd like to know exactly which means are unequal by running post hoc (Latin for “after this”) tests.

Our ANOVA will run fine in SPSS but in order to have confidence in its results, we need to satisfy some assumptions.

## ANOVA - Main Assumptions

- **Independent observations** often holds if each case (row of cells in SPSS) represents a unique person or other statistical unit. That is, we usually don't want more than one row of data for one person, which holds for our data.
- **Normally distributed variables** in the population seems reasonable if we look at the histograms we inspected earlier. Besides, violation of the normality assumption is no real issue for larger sample sizes due to the central limit theorem.
- **Homogeneity** means that the population variances of BDI in each medicine group are all equal, reflected in roughly equal sample variances. Again, our split histogram suggests this is the case but we'll try and confirm this by including **Levene's test** when running our ANOVA.

## Running our ANOVA in SPSS

There are many ways to run the exact same ANOVA in SPSS. Today, we'll go for General Linear Model - Univariate because it'll provide us with partial eta squared as an estimate for the effect size of our model. We'll briefly jump into the post hoc and options subdialogs before pasting our syntax.

The post hoc test we'll run is Tukey’s HSD (Honestly Significant Difference), denoted as “Tukey”. We'll explain how it works when we discuss the output.

“Estimates of effect size” refers to partial eta squared. “Homogeneity tests” includes Levene’s test for equal variances in our output.

## Post Hoc ANOVA Syntax

Following the previous screenshots results in the syntax below. We'll run it and explain the output.

```spss
*ANOVA syntax with Post Hoc (Tukey) test, Homoscedasticity (Levene's test) and effect size (partial eta squared).

UNIANOVA bdi BY medicine
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /POSTHOC=medicine(TUKEY)
  /PRINT=ETASQ HOMOGENEITY
  /CRITERIA=ALPHA(.05)
  /DESIGN=medicine.
```
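As noted earlier, the same ANOVA can be run in several ways. Purely as an illustration, the sketch below uses ONEWAY instead of UNIANOVA on the same variables (bdi and medicine); it is not the route taken in this tutorial. It requests descriptives, Levene's test and Tukey's HSD. Whether ONEWAY also reports an effect size depends on your SPSS version, which is one reason to stick with UNIANOVA and /PRINT=ETASQ here.

```spss
*Sketch only: one-way ANOVA with Levene's test and Tukey's HSD via ONEWAY.

ONEWAY bdi BY medicine
  /STATISTICS DESCRIPTIVES HOMOGENEITY
  /MISSING ANALYSIS
  /POSTHOC=TUKEY ALPHA(0.05).
```

The F-test and the Tukey comparisons should match the UNIANOVA output, since both commands fit the same one-factor model.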

## SPSS ANOVA Output - Levene’s Test

Levene’s Test checks whether the *population* variances of BDI for the four medicine groups are all equal, which is a requirement for ANOVA. “Sig.” = 0.949, so if the population variances are really equal, there's a 94.9% probability of finding sample variances that differ at least as much as ours do. This sample outcome is very likely under the null hypothesis of homoscedasticity, so we don't reject it and consider this assumption met for our ANOVA.

## SPSS ANOVA Output - Between Subjects Effects

If our population means are really equal, there's practically a 0% chance of finding sample differences as large as the ones we observed. **We reject the null hypothesis of equal population means**.

The different medicines administered account for some 39% of the variance in the BDI scores. This is the **effect size** as indicated by partial eta squared.

In this one-way design, Partial Eta Squared is the Sums of Squares for medicine divided by the corrected total sums of squares (2780 / 7071 = 0.39).

Sums of Squares Error represents the variance in BDI scores not accounted for by medicine. Note that the Sums of Squares for medicine and error add up to the corrected total.
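The arithmetic behind the effect size can be written out as follows. This is a sketch based only on the two rounded figures quoted above (2780 and 7071). The error sums of squares shown here is derived from them by subtraction, so it may differ slightly from the unrounded value in the output table.

```latex
\begin{aligned}
\eta_p^2 &= \frac{SS_{\text{medicine}}}{SS_{\text{medicine}} + SS_{\text{error}}}
          = \frac{SS_{\text{medicine}}}{SS_{\text{corrected total}}}
          \approx \frac{2780}{7071} \approx 0.39 \\
SS_{\text{error}} &= SS_{\text{corrected total}} - SS_{\text{medicine}}
          \approx 7071 - 2780 = 4291
\end{aligned}
```

The first two fractions are equal only because the model contains a single factor. With more factors, the denominator of partial eta squared is no longer the corrected total.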

## SPSS ANOVA Output - Multiple Comparisons

So far, we only concluded that all four means being equal is very unlikely. So exactly **which mean differs from which mean?** Well, the histograms and means tables we ran before our ANOVA point us in the right direction but we'll try and back that up with a more formal test: Tukey’s HSD as shown in the multiple comparisons table.

Right, now comparing 4 means results in (4 - 1) x 4 x 0.5 = 6 distinct comparisons, each of which is listed twice in this table. There are three ways of telling which means are likely to be different:

- Statistically significant mean differences are **flagged** with an asterisk (*). For instance, the very first line tells us that “None” has a mean BDI score 6.7 points higher than the placebo, which is quite a lot actually since BDI scores can range from 0 through 63.
- As a rule of thumb, **“Sig.” < 0.05 indicates a statistically significant difference** between two means.
- A confidence interval *not* including zero means that a zero difference between these means in the population is unlikely.

Obviously, all three approaches result in the same conclusions.
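The count of six comparisons is simply the number of ways to pick 2 groups out of k = 4. A short worked version, where k is our own notation for the number of groups:

```latex
\binom{k}{2} = \frac{k(k-1)}{2} = \frac{4 \times 3}{2} = 6
```

Because each pair appears once in each direction, the multiple comparisons table holds 2 x 6 = 12 rows.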

So that's it for now. I hope this tutorial helps you to run ANOVA with post hoc tests confidently. If you have any suggestions, please let me know by leaving a comment below.

## This tutorial has 25 comments

## By Ruben Geert van den Berg on October 23rd, 2016

Thanks for the comment!

However, when writing new tutorials, we can't take into account all possible scenarios because the tutorials would become way too long for the average reader. Most of our visitors want a fast and simple answer to their question and this obviously conflicts with our attempts to deliver accurate and complete information. Deciding what to include or exclude in each tutorial is a difficult decision, it's hard to find the right balance.

That being said, you probably don't want to include any very small group(s) anyway in your ANOVA: small n's are associated with large standard errors so your estimate of a group mean on n = 2 or 3 cases is very unstable. Either FILTER out this group or merge it with other small groups (see RECODE) and label that new category as "group = Other" or something.
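A minimal sketch of both options, under loudly stated assumptions: the value codes below are hypothetical (we pretend medicine = 4 is a very small group and that codes 3 and 4 could sensibly be merged), and keep_case and medicine_merged are made-up variable names. None of this applies to the tutorial data, where every group holds 25 cases.

```spss
*Option 1 (hypothetical codes): filter out the small group, then rerun the ANOVA.
COMPUTE keep_case = (medicine <> 4).
FILTER BY keep_case.

*Rerun the UNIANOVA command here, then switch the filter off again.
FILTER OFF.

*Option 2 (hypothetical codes): merge small groups into a new category 9.
RECODE medicine (3, 4 = 9) (ELSE = COPY) INTO medicine_merged.
ADD VALUE LABELS medicine_merged 9 'Other'.
EXECUTE.
```

Recoding into a new variable leaves the original medicine variable untouched, so the merge is easy to undo. Note that the new variable does not inherit the value labels of the original.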

Hope that helps!

## By ThankGod on October 23rd, 2016

this tutorial is very helpful! but i wish the tutorial took into account some errors encountered in performing the analysis. for instance, when performing ANOVA and post hoc tests you might receive a warning of fewer than two cases in one group, so the post hoc test and ANOVA can't be performed. thanks.

## By MrMystic on October 11th, 2016

Hi, neat guide! I have a question: I want to do an ANOVA with post hoc and then represent my results (compared means) in a diagram with 95% confidence interval bars. Is there a way to do it with SPSS?

Because I have read somewhere that the 95% CIs from Tukey represent the difference between the compared means (as you say significant difference if CI doesnt contain zero). How do I get the CIs I need?

Thanks!

## By Ruben Geert van den Berg on October 9th, 2016

Hi Elvira!

ANOVA always has one dependent (outcome) variable. This is what's GLM, Univariate is for. If there's more than 1, ANOVA becomes MANOVA (multivariate ANOVA). That'll be GLM, Multivariate.

"One-way" means there's one independent variable. With 2 independent variables, we have a two-way ANOVA. And so on.

Don't combine 2 or more factors (variables) into a single variable for ANOVA, it doesn't make any sense whatsoever.

You could use a FILTER for excluding one or more levels of a factor from the analysis. Or alternatively, combine levels (these are values or categories) into one with RECODE.

The most common scenario is having some categories with very few observations in them. You could combine those into a new category labeled something like "Other".

Last, if you find ANOVA difficult, read up on ANOVA - What Is It? and perhaps SPSS One-Way ANOVA.

Hope that helps!

## By Elvira on October 8th, 2016

If we go General Linear Model - Multivariate, is it one-way anova for multiple variables?

Another question, is there a way to combine data within one fixed factor and run analysis for combined data in SPSS? Or run analysis for part of the fixed factor? Sorry if questions are stupid, I am not good at statistics.

Thank you for the tutorial. It is very helpful for me.