
# SPSS ANOVA with Post Hoc Tests

Post hoc tests in ANOVA test if the difference between
each possible pair of means is statistically significant.
This tutorial walks you through running and understanding post hoc tests using depression.sav, partly shown below. The variables we'll use are the medicine that our participants were randomly assigned to, and their depression levels measured 16 weeks after starting medication.

Our research question is whether some medicines result in lower depression scores than others. A better analysis here would have been ANCOVA but, sadly, no depression pretest was administered.

## Quick Data Check

Before blindly jumping into any analyses, let's first check whether our data look plausible at all. A good first step is inspecting a histogram, which we'll run from the SPSS syntax below.

*QUICK CHECK DEPENDENT VARIABLE.

frequencies bdi
/format notable
/histogram.
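For readers who'd like to replicate this quick check outside SPSS, a text histogram is easy to sketch in plain Python. The scores below are randomly generated stand-ins, not the actual depression.sav data:

```python
from collections import Counter
import random

# Fake BDI-style scores (randomly generated), just to illustrate the idea.
random.seed(1)
scores = [random.gauss(8, 3) for _ in range(100)]

# Bin scores into 2-point-wide bars, similar to a default SPSS histogram.
bins = Counter(int(s // 2) * 2 for s in scores)
for left in sorted(bins):
    print(f"{left:>3}-{left + 2:<3} {'#' * bins[left]}")
print("N =", len(scores))  # no missing values here, so N = 100
```

The same check SPSS gives us for free: eyeball the shape for outliers and confirm N matches the number of cases.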

## Result

First off, our histogram (shown below) doesn't show anything surprising or alarming. Also, note that N = 100 so this variable does not have any missing values. Finally, it could be argued that a single participant near 15 points could be an outlier. It doesn't look too bad so we'll just leave it for now.

## Descriptive Statistics for Subgroups

Let's now run some descriptive statistics for each medicine group separately. The right way to do so is from Analyze → Compare Means → Means, or by simply typing the 2 lines of syntax shown below.

*DESCRIPTIVE STATISTICS FOR SUBGROUPS OF CASES.

means bdi by medicine
/cells count min max mean median stddev skew kurt.
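The same per-group descriptives can be sketched in plain Python with the standard library's statistics module. The group names and scores below are invented for illustration, not taken from depression.sav:

```python
import statistics as st

# Invented scores per medicine group (NOT the depression.sav data).
groups = {"Placebo": [9, 11, 8, 12, 10], "Medicine A": [4, 6, 5, 7, 3]}

# One row per group, mirroring the /cells subcommand of MEANS.
for name, scores in groups.items():
    print(f"{name}: n={len(scores)}, min={min(scores)}, max={max(scores)}, "
          f"mean={st.mean(scores):.1f}, median={st.median(scores)}, "
          f"sd={st.stdev(scores):.2f}")
```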

## Result

As shown, I like to present a nicely detailed table including several descriptive statistics for each group separately. But most important here are the sample sizes, because these affect which assumptions we'll need for our ANOVA.

Also note that the mean depression scores are quite different across medicines. However, these are based on rather small samples. So the big question is: what can we conclude about the entire populations? That is: all people who'll take these medicines?

## ANOVA - Null Hypothesis

In short, our ANOVA tries to demonstrate that some medicines work better than others by nullifying the opposite claim. This null hypothesis states that the population mean depression scores are equal across all medicines. An ANOVA will tell us if this is credible, given the sample data we're analyzing. However, these data must meet a couple of assumptions in order to trust the ANOVA results.
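Under the hood, ANOVA evaluates this null hypothesis with an F statistic: the ratio of between-group variance to within-group variance. A minimal pure-Python sketch, using made-up scores rather than depression.sav:

```python
# One-way ANOVA F statistic from scratch (illustrative data only).
def one_way_f(groups):
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: how far group means lie from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)   # df = k - 1
    ms_within = ss_within / (n_total - k)  # df = N - k
    return ms_between / ms_within

print(round(one_way_f([[1, 2, 3], [2, 3, 4], [4, 5, 6]]), 3))  # → 7.0
```

A larger F means the group means lie further apart relative to the noise within groups; SPSS compares it against an F distribution with k − 1 and N − k degrees of freedom to obtain the p-value.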

## ANOVA - Assumptions

ANOVA requires the following assumptions:

• independent observations;
• normality: the dependent variable must be normally distributed within each subpopulation we're comparing. However, normality is not needed if each n > 25 or so;
• homogeneity: the variance of the dependent variable must be equal across all subpopulations we're comparing. However, homogeneity is not needed if all sample sizes are roughly equal.

So homogeneity is only required when sample sizes are sharply unequal. In that case, Levene's test can be used to examine whether homogeneity is met. What to do if it isn't is covered in SPSS ANOVA - Levene’s Test “Significant”.
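Levene's test is itself essentially an ANOVA run on the absolute deviations of each score from its group mean. A rough pure-Python sketch of its W statistic, with toy numbers rather than real data:

```python
# Levene's W: a one-way ANOVA on absolute deviations from group means.
def levene_w(groups):
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Transform each score into its absolute deviation from its group mean.
    z = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    z_grand = sum(sum(g) for g in z) / n_total
    z_means = [sum(g) / len(g) for g in z]
    # Same between/within decomposition as ordinary ANOVA, applied to z.
    numer = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    denom = sum(sum((x - m) ** 2 for x in g) for g, m in zip(z, z_means))
    return (n_total - k) / (k - 1) * numer / denom

print(round(levene_w([[0, 1, 5], [0, 4, 8]]), 4))  # → 0.2105
```

W is then compared against an F distribution with k − 1 and N − k degrees of freedom; a significant result suggests the homogeneity assumption is violated.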

## ANOVA - Flowchart

The flowchart below summarizes when/how to check the ANOVA assumptions and what to do if they're violated. Note that depression.sav contains 4 medicine samples of n = 25 independent observations. It therefore meets all ANOVA assumptions.

## SPSS ANOVA Dialogs

We'll run our ANOVA from Analyze → Compare Means → One-Way ANOVA. Next, let's fill out the dialogs as shown below.

Estimate effect size(...) is only available in SPSS version 27 or higher. If you're on an older version, you can get it from Analyze → Compare Means → Means (“ANOVA table” under “Options”).

Tukey's HSD (“honestly significant difference”) is the most common post hoc test for ANOVA. It is listed under “equal variances assumed”, which refers to the homogeneity assumption. However, this is not needed for our data because our sample sizes are all equal.

Completing these steps results in the syntax below.

*BASIC ANOVA EFFECT SIZE AND POST HOC TESTS.

ONEWAY bdi BY medicine
/ES=OVERALL
/MISSING ANALYSIS
/CRITERIA=CILEVEL(0.95)
/POSTHOC=TUKEY ALPHA(0.05).

## SPSS ANOVA Output

First off, the ANOVA table shown below addresses the null hypothesis that all population means are equal. The significance level indicates that p < .001, so we reject this null hypothesis. The figure below illustrates how this result should be reported.

What's absent from this table is eta squared, denoted as η2. In SPSS 27 and higher, we find it in the next output table, shown below. Eta squared is an effect size measure: a single, standardized number that expresses how different several sample means are (that is, how far they lie apart). Generally accepted rules of thumb for eta squared are that

• η2 = 0.01 indicates a small effect;
• η2 = 0.06 indicates a medium effect;
• η2 = 0.14 indicates a large effect.

For our example, η2 = 0.39 is a huge effect: our 4 medicines resulted in dramatically different mean depression scores.
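Eta squared is simply the between-group sum of squares divided by the total sum of squares: the proportion of variance in the outcome accounted for by group membership. A minimal sketch with invented scores, not the depression.sav results:

```python
# Eta squared = SS_between / SS_total (proportion of variance explained).
def eta_squared(groups):
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means versus the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Total sum of squares: every score versus the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    return ss_between / ss_total

print(round(eta_squared([[1, 2, 3], [2, 3, 4], [4, 5, 6]]), 3))  # → 0.7
```

By the rules of thumb above, these toy groups would show a huge effect, just like our actual analysis.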

This may seem to complete our analysis but there's one thing we don't know yet: precisely which mean differs from which mean? This final question is answered by our post hoc tests that we'll discuss next.

## SPSS ANOVA - Post Hoc Tests Output

The table below shows whether the difference between each pair of means is statistically significant. It also includes 95% confidence intervals for these differences. Mean differences that are “significant” at our chosen α = .05 are flagged. Note that each mean differs from each other mean, except for Placebo versus Homeopathic. If we take a good look at the exact 2-tailed p-values, we see that they're all < .01 except for this one comparison.
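With 4 medicines, the post hoc table covers all 6 possible pairs of groups. Tukey's HSD relies on the studentized range distribution, which SPSS handles internally; the sketch below just enumerates the pairs and their mean differences, which is the bookkeeping behind such a table. The group labels and means are invented for illustration:

```python
from itertools import combinations

# Invented group means (NOT the actual depression.sav results).
means = {"Placebo": 7.1, "Homeopathic": 6.9, "Medicine A": 4.2, "Medicine B": 2.8}

# Every unordered pair of groups gets one row in the post hoc table.
for (g1, m1), (g2, m2) in combinations(means.items(), 2):
    print(f"{g1} vs {g2}: mean difference = {m1 - m2:+.1f}")
```

In general, k groups yield k(k − 1)/2 comparisons, which is exactly why a multiple-comparison correction such as Tukey's HSD is needed in the first place.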

Given this last finding, I suggest rerunning our post hoc tests at α = .01 to reconfirm these findings. The syntax below does just that.

*BASIC ANOVA WITH ALPHA = 0.01 FOR POST HOC TESTS.

ONEWAY bdi BY medicine
/ES=OVERALL
/MISSING ANALYSIS
/CRITERIA=CILEVEL(0.95)
/POSTHOC=TUKEY ALPHA(0.01).

## APA Style Reporting Post Hoc Tests

The table below shows how to report post hoc tests in APA style. This table itself was created with a MEANS command like we used for Descriptive Statistics for Subgroups. The subscripts are based on the Homogeneous Subsets table in our ANOVA output.

Note that we chose α = .01 instead of the usual α = .05. This is simply more informative for our example analysis because all of our p-values < .05 are also < .01.

This APA table also seems available from Analyze → Tables → Custom Tables, but I don't recommend this: the p-values from Custom Tables seem to be based on Bonferroni-adjusted independent samples t-tests instead of Tukey's HSD, and the general opinion is that the Bonferroni procedure is overly conservative.
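For reference, a Bonferroni correction simply multiplies each raw p-value by the number of comparisons (capping at 1), which is why it becomes increasingly conservative as the number of tests grows. A minimal sketch with made-up p-values:

```python
# Bonferroni adjustment: p_adj = min(1, p * m) for m comparisons.
def bonferroni(p_values):
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

print([round(p, 3) for p in bonferroni([0.004, 0.02, 0.4])])  # → [0.012, 0.06, 1.0]
```

With the 6 comparisons of a 4-group ANOVA, a raw p-value would need to fall below roughly .0083 to stay under α = .05 after adjustment, noticeably stricter than Tukey's HSD for the same data.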

## Final Notes

Right, so a common routine for ANOVA with post hoc tests is

1. run a basic ANOVA to see if the population means are all equal. This is often referred to as the omnibus test (omnibus is Latin, meaning something like “about all things”);
2. only if we reject this overall null hypothesis, then find out precisely which pairs of means differ with post hoc tests (post hoc is Latin for “after that”).

Running post hoc tests when the omnibus test is not statistically significant is generally frowned upon. Some scenarios could perhaps justify doing so but let's leave that discussion for another day.

Right, so that should do. I hope you found this tutorial helpful. If you've any questions or remarks, please throw me a comment below. Other than that...


# THIS TUTORIAL HAS 5 COMMENTS:

• ### By Jon Peck on November 29th, 2021

Regarding CTABLES, the tests provide, in addition to Bonferroni, the less conservative Benjamini-Hochberg multiple comparison correction. You can also choose no correction, which may be appropriate in some cases.

• ### By Ruben Geert van den Berg on November 30th, 2021

Hi Jon, thanks for your thoughts!

I wanted to make 2 points:

1) Most standard textbooks seem to recommend Tukey/Games-Howell as the most agreed-upon options for equal variances (not) assumed. I wanted to point out that CTABLES uses a different procedure than those. IMHO, it would perhaps have been better to implement Tukey rather than Bonferroni or BH here, just to stay in line with the standard literature.

2) If I run an independent-samples t-test and manually Bonferroni correct the (equal variances assumed) p-value, it does not correspond to the exact p-value from CTABLES. This made me wonder precisely how the CTABLES Bonferroni p-values are computed but I didn't find anything on that... Did you ever try replicating them?

Also, "Help → Algorithms" has been gone from SPSS since version 18 or 19? Bad decision...

• ### By Jon K Peck on November 30th, 2021

Bonferroni is a generic correction for any multiple testing situation. It doesn't depend on the test type. BH is also a generic multiple testing correction but with a different significance calculation. Both are very popular. BH was added in V24 to compensate for Bonferroni being sometimes too conservative, especially with a lot of tests.

Tukey (equal variances assumed) and GH (equal variances not assumed) are different beasts. Since it is recommended that these not be used unless the overall test rejects, I'm not quite sure how that would fit with CTABLES. When BH was added, there was concern that it would be beyond the typical CTABLES user, but I agree that it would be nice to have more choices.

In fact, I am currently engaged in a consulting project where we run GLM with multiple factors and, with some Python code, replace the tests in CTABLES with the GLM post hoc results, taking into consideration the overall F statistic barrier, too. In that case, the client mostly wants LSD, but it would work with any test type.

I would like to see the Algorithms doc actually integrated into the dialog helps, or at least linked from them, and it ought to be on the Help menu at the very least. Nevertheless, it is readily available online with all the other docs and referred to in the regular help:
https://www.ibm.com/docs/SSLVMB_28.0.0/pdf/IBM_SPSS_Statistics_Algorithms.pdf
I keep a copy of the pdf on my desktop.

There is a section for CTABLES entitled “Multiple Comparison Adjustments for Column Means and Column Proportions Tests”.

The details of the column proportions and column means calculations are also spelled out for CTABLES, with and without weights of various types.

• ### By Ruben Geert van den Berg on December 1st, 2021

Hi Jon!

If you're working on it anyway: did you try to replicate the CTABLES Bonferroni pairwise mean comparison p-values via T-TEST? When doing so, I saw a small but non-negligible difference that I've no explanation for.

Integrating the algorithms into the dialogs is a great suggestion!

Finally: "the client mostly wants LSD". I guess that's fine as long as they don't consume it during their working hours. It kinda affects your concentration but -then again- also makes you think more creatively ;-)
