Null Hypothesis for the Chi-Square Independence Test

A chi-square independence test evaluates whether two categorical variables are associated in some population. We'll therefore try to refute the null hypothesis that two categorical variables are (perfectly) independent in some population. If this null hypothesis is true and we draw a sample from this population, then we may still see some association between these variables in our sample. This is because samples tend to differ somewhat from the populations from which they're drawn.
However, a strong association between variables is unlikely to occur in a sample if the variables are independent in the entire population. If we do observe this anyway, we'll conclude that the variables probably aren't independent in our population after all. That is, we'll reject the null hypothesis of independence.

Example

A sample of 183 students evaluated some course. Apart from their evaluations, we also have their genders and study majors. The data are in course_evaluation.sav, part of which is shown below.

[Screenshot: SPSS Chi-Square Independence Test, Variable View]

We'd now like to know: is study major associated with gender? And -if so- how? Since study major and gender are nominal variables, we'll run a chi-square test to find out.

Assumptions Chi-Square Independence Test

Conclusions from a chi-square independence test can be trusted if two assumptions are met:

Independent observations. This usually -not always- holds if each case in SPSS holds a unique person or other statistical unit. Since this is the case for our data, we'll assume this has been met.
For a 2 by 2 table, all expected frequencies > 5. For a larger table, no more than 20% of all cells may have an expected frequency < 5 and all expected frequencies > 1. If you've no idea what that means, you may consult Chi-Square Independence Test - Quick Introduction.

SPSS will check the second assumption for us when we run our test. We'll get to it later.
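
For readers unsure what an expected frequency is, here is a brief sketch in standard notation. The symbols are introduced here for illustration and don't appear in the SPSS output: O is the observed frequency of a cell, E its expected frequency under independence, R the cell's row total, C its column total and N the total sample size.

```latex
% Expected frequency for the cell in row i and column j under independence
E_{ij} = \frac{R_i \, C_j}{N}

% Pearson chi-square statistic: summed over all cells of the table
\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
```

The larger the differences between observed and expected frequencies, the larger the chi-square statistic becomes.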

Chi-Square Independence Test in SPSS

In SPSS, the chi-square independence test is part of the CROSSTABS procedure which we can run as shown below.

[Screenshot: SPSS Chi-Square Independence Test, Crosstabs dialog]

In the main dialog, we'll enter one variable into the Row(s) box and the other into Column(s). Since sex has only 2 categories (male or female), using it as our column variable results in a table that's rather narrow and tall. It will fit more easily into our final report than the wider table that results from using major as our column variable. Either way, both options yield identical test results.
Under Statistics we'll just select Chi-Square. Clicking Paste results in the syntax below.

SPSS Chi-Square Independence Test Syntax

*Crosstabs with Chi-Square test as pasted from menu.

CROSSTABS
/TABLES=major BY sex
/FORMAT=AVALUE TABLES
/STATISTICS=CHISQ
/CELLS=COUNT
/COUNT ROUND CELL.

You can use this syntax if you like but I personally prefer a shorter version shown below. I simply type it into the Syntax Editor window, which for me is much faster than clicking through the menu. Both versions yield identical results.

*Crosstabs with Chi-Square test - short version.

crosstabs major by sex
/statistics chisq.

Output Chi-Square Independence Test

[Screenshot: SPSS Chi-Square Independence Test, Case Processing Summary output]

First off, we take a quick look at the Case Processing Summary to see if any cases have been excluded due to missing values. That's not the case here. With other data, if many cases are excluded, we'd like to know why and if it makes sense.
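
If cases had been excluded, one quick way to see which variable is responsible is a frequency table per variable. A minimal sketch, assuming the same variable names as in the syntax above:

```spss
*Frequency tables showing valid and missing counts per variable.

frequencies major sex.
```

Each table lists the number of valid and missing cases, so a variable with many missing values stands out.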

Contingency Table

[Screenshot: SPSS Chi-Square Independence Test, crosstab with counts]

Next, we inspect our contingency table. Note that its marginal frequencies -the frequencies reported in the margins of our table- show the frequency distributions of either variable separately.
Both distributions look plausible and since there are no “no answer” categories, there's no need to specify any user missing values.

Significance Test

[Screenshot: SPSS Chi-Square Independence Test, significance output]

First off, our data meet the assumption of all expected frequencies > 5 that we mentioned earlier; SPSS reports this in the footnote below the Chi-Square Tests table. Since this holds, we can rely on our significance test, for which we use Pearson Chi-Square.
Right, we usually say that the association between two variables is statistically significant if Asymptotic Significance (2-sided) < 0.05, which is clearly the case here.
Significance is often referred to as “p”, short for probability; it is the probability of observing an association at least as strong as the one in our sample if our variables are independent in the entire population. SPSS reports this probability as 0.000, which means it is smaller than 0.0005. Conclusion: we reject the null hypothesis that our variables are independent in the entire population.
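
To see the expected frequencies themselves rather than only a summary of them, the EXPECTED keyword can be added to the /CELLS subcommand. A minimal sketch, assuming the same variable names as in the syntax above:

```spss
*Crosstabs with observed and expected frequencies.

crosstabs major by sex
/cells count expected
/statistics chisq.
```

Each cell of the resulting table then shows its expected count next to its observed count, which makes it easy to spot any cells that fall below 5.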

Understanding the Association Between Variables

We conclude that our variables are associated but what does this association look like? Well, one way to find out is inspecting either column or row percentages. I'll compute them by adding a line to my syntax as shown below.

*Show only variable/value labels in output.

set tvars labels tnumbers labels.

*Crosstabs with frequencies and row percentages.

crosstabs major by sex
/cells count row
/statistics chisq.

Adjusting Our Table

Since I'm not too happy with the format of my newly run table, I'll right-click it and select Edit Content In Separate Window.

[Screenshot: SPSS pivot table, Edit Content In Separate Window]

We select Pivoting Trays and then drag and drop Statistics right underneath “What's your gender?”. We'll close the pivot table editor.

Result

[Screenshot: SPSS Chi-Square Independence Test, association in the adjusted table]

Roughly half of our sample is female. Within psychology, however, a whopping 87% is female. That is, females are highly overrepresented among psychology students. Like so, study major “says something” about gender: if I know somebody studies psychology, I know she's probably female.
The opposite pattern holds for economy students: some 80% of them are male. In short, our row percentages describe the association we established with our chi-square test.
We could quantify the strength of the association by adding Cramér’s V to our test but we'll leave that for another day.
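
For readers who want to try this right away: in CROSSTABS, the PHI keyword on the /STATISTICS subcommand requests both Phi and Cramér’s V. A minimal sketch, assuming the same variable names as before:

```spss
*Crosstabs with row percentages and chi-square test plus Phi and Cramer V (PHI keyword).

crosstabs major by sex
/cells count row
/statistics chisq phi.
```

The output should then include a Symmetric Measures table. For a table larger than 2 by 2, Cramér’s V is the value to read.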

Reporting a Chi-Square Independence Test

We report the significance test with something like “an association between gender and study major was observed, χ²(4) = 54.50, p < 0.001”. Further, I suggest including our final contingency table (with frequencies and row percentages) in the report as well, as it gives a lot of insight into the nature of the association.
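
The 4 between parentheses is the number of degrees of freedom, which follows from the table dimensions. A brief sketch, with r the number of rows and c the number of columns. The 5 study majors used below are an assumption: the tutorial doesn't state this number, but it is what 4 degrees of freedom with 2 gender categories implies.

```latex
% Degrees of freedom for a table with r rows and c columns
df = (r - 1)(c - 1)

% Assuming r = 5 study majors and c = 2 genders
df = (5 - 1)(2 - 1) = 4
```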

So that's about it for now. Thanks for reading!

THIS TUTORIAL HAS 72 COMMENTS:

By Aliu on August 18th, 2015

Beautiful tutorials, quite helpful & easy to understand. Good job.

By sher bahadar khan on December 30th, 2015

We know that a p value less than 0.05 is considered significant, so does a p value of .000 mean the highest significance?

By Ruben Geert van den Berg on December 30th, 2015

Short answer: yes.

Long answer: it's the highest statistical significance. It only means there's a (roughly) 0 probability of finding an association this strong in a sample if the association between variables is zero in the population (variables perfectly independent).

However, it does not mean that the variables are strongly associated; because the p value depends on the sample size, a minor association with a huge sample size can result in a p value of 0.000. A table (or better: a chart) gives a much better idea of how (strongly) variables are associated.

We therefore recommend you always add it to the p value. Also see this tutorial.

By Elaine Tennant on January 18th, 2016

Thank you for this helpful tutorial. I am analysing how speciality affects whether a certain test is done or not. So I have a categorical dependent variable and a categorical nominal independent variable with six levels. When I've done chi-square tests in the past and the independent variable had two levels (y/n), I have used Fisher's exact test as the statistic if any cells had an expected count < 5. My output for this particular test shows that six cells have an expected count less than five but a result for Fisher's test is not reported. Can I use the test statistic for the Pearson chi-square? Many thanks for your help.

By Ruben Geert van den Berg on January 18th, 2016

Dear Elaine,

The Fisher exact test can only be reported for 2 * 2 tables which is why you don't see it for a 6 * 2 table.

Keep in mind that "all expected frequencies > 5" is nothing more than a rough rule of thumb; if some cells have an expected frequency of 4.8 or so, I wouldn't worry too much.

However, if some cells have a very small frequency, you may consider merging some of the categories of the categorical variable. This will result in a smaller crosstab and hence reduces the chance of finding many (nearly) empty cells.
