By Ruben Geert van den Berg on September 16, 2014 under SPSS Chi-Square Test Tutorials.

SPSS Chi-Square Independence Test

SPSS Chi-Square Independence Test - What Is It?

The chi-square independence test is a procedure for testing whether two categorical variables are independent in some population. Independence holds if the frequency distribution of one variable is identical for each level of the other variable. If it doesn't hold, there's at least some relation between the two variables, and a table or chart will tell us what this relation looks like.

SPSS Independent Samples Chi-Square Test Example

A marketeer wants to know the relation between the brand of smartphone people use and the brand they'd like to use. She'll first try to establish these are related in the first place by testing the null hypothesis that the current phone brand and the desired phone brand are independent. She collects data on 150 respondents, resulting in phone_brands.sav, part of which is shown below.

Phone Brands Data - Data View

1. Quick Data Check

It's good practice to always inspect your data before running any statistical tests. For the data at hand, a clustered bar chart is a nice option for seeing what the data basically look like. The screenshots below walk you through the steps.

SPSS Chi-Square Independence Test Bar Chart Dialog

We first navigate to Graphs → Legacy Dialogs → Bar.
Next, we select Clustered and
Summaries for groups of cases.
Click Define.

SPSS Chi-Square Independence Test Bar Chart Dialog

Select N of cases,
move preferred to Category Axis and
current to Define clusters by.
Clicking Paste results in the syntax below.

*Check data with grouped bar chart.

GRAPH /BAR(GROUPED)=COUNT BY current BY preferred.

SPSS Chi-Square Independence Test Bar Chart

The main conclusion from this graph is that smartphone users are quite loyal to brands; users of every brand still prefer the brand they're using. The effect is strongest for HTC users. The four frequency distributions are far from similar; independence between current and preferred doesn't seem to hold even approximately.
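As an additional check, frequency tables show whether each variable contains only the expected brand categories and how many values are missing. A minimal sketch, assuming the variables are named current and preferred as in the syntax above:

```spss
*Optional check: frequency tables for both variables.

frequencies variables = current preferred.
```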

2. Assumptions Chi-Square Independence Test

Although the chi-square independence test will run just fine in SPSS, the credibility of its results depends on some assumptions. These are:

  1. independent and identically distributed variables (or, less precisely, “independent observations”);
  2. none of the cells has an expected frequency < 5.

Assumption 1 is mainly theoretical. The precise meaning of assumption 2 is explained in the chi-square independence test tutorial. SPSS checks this assumption whenever you run this test, so we'll see the result of that in our output in a minute.
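Assumption 2 can also be inspected cell by cell by adding expected frequencies to the crosstabulation. The syntax below is a sketch rather than something pasted from the dialogs in this tutorial; it relies on the CELLS subcommand of CROSSTABS with the COUNT and EXPECTED keywords.

```spss
*Show observed and expected frequencies in each cell.

crosstabs current by preferred
/cells count expected
/statistics chisq.
```

Any cell with an expected count below 5 counts against assumption 2. SPSS also summarizes this in a footnote below the Chi-Square Tests table.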

3. Run SPSS Chi-Square Independence Test

SPSS Chi-Square Independence Test Dialog

We'll navigate to Analyze → Descriptive Statistics → Crosstabs.

SPSS Chi-Square Independence Test Dialog

We'll move current to Row(s) and
preferred to Column(s).
Select Chi-square under Statistics.
Clicking Paste results in the syntax below.


*Run crosstab with chi-square independence test.

crosstabs current by preferred
/stat chisq.

4. SPSS Chi-Square Independence Test Output

SPSS Chi-Square Independence Test Output

We'll first look at the Crosstabulation table. Since both variables have 4 answer categories, (4 * 4 =) 16 different combinations may occur in the data. For each combination (or “cell”), the table presents the frequency with which it occurs. We already saw a visual representation of these 16 observed frequencies in the graph we ran earlier.
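Raw frequencies are hard to compare when the row totals differ. One option, assuming current is in the rows as above, is to add row percentages: under independence, the percentages would be roughly the same in every row.

```spss
*Add row percentages to compare preferred brands across current brands.

crosstabs current by preferred
/cells count row.
```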

SPSS Chi-Square Independence Test Output

Next, we'll inspect the Chi-Square Tests table. The null hypothesis of independence implies an expected frequency for each cell, which follows from the row and column totals. The observed frequencies usually differ somewhat from these expected frequencies. The Pearson Chi-Square test statistic basically expresses the total difference between the 16 observed frequencies and their expected counterparts; the larger its value, the larger the difference between the data and the null hypothesis.
The p-value, denoted by “Asymp. Sig. (2-sided)”, is reported as .000. SPSS rounds to three decimals, so this means p < .0005 rather than exactly zero: if the variables are perfectly independent in the population, there's a less than 0.05% chance of finding the observed (or a larger) degree of association in a sample.
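For readers who want the arithmetic behind these numbers, the standard formulas are sketched below. The notation is ours and does not appear in the SPSS output: O_ij and E_ij are the observed and expected frequencies in row i and column j, R_i and C_j are the row and column totals, and N = 150 is the sample size.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{4} \sum_{j=1}^{4} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
df = (4 - 1)(4 - 1) = 9
```

The degrees of freedom follow from the table having 4 rows and 4 columns.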

5. Reporting the Chi-Square Independence Test

We always report the crosstabulation of observed frequencies. “Contingency table” and “bivariate frequency distribution” are synonyms for crosstabulation. Regarding the significance test, we report the Pearson Chi-Square value, df (= degrees of freedom) and p-value, as in “we observed a strong association between the current and the preferred brands, χ²(9) = 131.2, p < .001.” Since SPSS shows a rounded p-value of .000, we report it as p < .001 rather than p = .000.
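The chi-square value and p-value alone don't quantify how strong an association is. One common effect size for crosstabulations is Cramér's V, which this tutorial doesn't cover. As a sketch, CROSSTABS reports it when PHI is added to the STATISTICS subcommand.

```spss
*Also request the Phi and Cramer V table.

crosstabs current by preferred
/statistics chisq phi.
```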
