By Ruben Geert van den Berg on September 16, 2014 under SPSS Chi-Square Test Tutorials.

# SPSS Chi-Square Independence Test

The chi-square independence test is a procedure for testing if two categorical variables are independent in some population. This holds if the frequency distribution of one variable is identical for each level of the other variable. If not, there's at least *some* relation between the 2 variables and a table or chart will tell us what this relation looks like.
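To make "identical frequency distributions" concrete, here's a minimal Python sketch (outside SPSS). The counts and function names are made up for illustration and don't come from the tutorial's data. In real samples the distributions are rarely *exactly* identical even under independence, which is why we need a significance test.

```python
from fractions import Fraction

# Hypothetical counts (NOT from phone_brands.sav): each row is one level of
# the first variable, each column one level of the second variable.
table = [
    [10, 20, 30],
    [20, 40, 60],
]

def row_distributions(counts):
    """Return every row as exact proportions of its own row total."""
    return [[Fraction(cell, sum(row)) for cell in row] for row in counts]

def looks_independent(counts):
    """True if all rows share the same distribution over the columns."""
    dists = row_distributions(counts)
    return all(dist == dists[0] for dist in dists)

print(looks_independent(table))  # True: both rows are 1/6, 1/3, 1/2
```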

## SPSS Independent Samples Chi-Square Test Example

A marketeer wants to know the relation between the brand of smartphone people use and the brand they'd *like to* use. She'll first try to establish these are related in the first place by testing the **null hypothesis** that the current phone brand and the desired phone brand are independent.
She collects data on 150 respondents, resulting in phone_brands.sav, part of which is shown below.

## 1. Quick Data Check

It's good practice to always inspect your data before running any statistical tests. For the data at hand, a clustered bar chart is a nice way to see what the data look like. The screenshots below walk you through.

We first open the bar chart dialog and choose a clustered chart. We then move `preferred` and `current` into the chart definition. Completing these steps results in the syntax below.

***Check data with grouped bar chart.**

GRAPH /BAR(GROUPED)=COUNT BY current BY preferred.

The main conclusion from this graph is that smartphone users are quite loyal to brands; users of every brand still prefer the brand they're using. The effect is strongest for HTC users. The four clusters of bars are far from similar; independence between `current` and `preferred` doesn't seem to hold even approximately.

## 2. Assumptions Chi-Square Independence Test

Although the chi-square independence test will run just fine in SPSS, the credibility of its results depends on some assumptions. These are

- independent and identically distributed variables (or, less precisely, “independent observations”);
- none of the cells has an expected frequency < 5.

Assumption 1 is mainly theoretical. The precise meaning of assumption 2 is explained in chi-square independence test. SPSS checks this assumption whenever you run this test so we'll see the result of that in a minute in our output.
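As a rough illustration of assumption 2, the Python sketch below (outside SPSS) computes expected frequencies the usual way, as row total times column total divided by the sample size, and counts the cells that fall below 5. The observed counts and function names are hypothetical; they are not the phone brand data.

```python
def expected_frequencies(observed):
    """Expected count per cell under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def cells_below(expected, threshold=5):
    """Number of cells whose expected frequency is smaller than the threshold."""
    return sum(cell < threshold for row in expected for cell in row)

# Hypothetical 2 x 2 table of observed counts (assumption, not real data).
observed = [
    [12, 3],
    [9, 6],
]

expected = expected_frequencies(observed)
print(expected)               # [[10.5, 4.5], [10.5, 4.5]]
print(cells_below(expected))  # 2
```

With these made-up counts, two of the four cells have an expected frequency of 4.5, so assumption 2 would not be met.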

## 3. Run SPSS Chi-Square Independence Test

We'll open the Crosstabs dialog, move `current` to the rows and `preferred` to the columns, and request the chi-square statistic. Completing these steps results in the syntax below.

## Syntax

***Run crosstab with chi-square independence test.**

crosstabs current by preferred
/stat chisq.

## 4. SPSS Chi-Square Independence Test Output

We'll first look at the **Crosstabulation** table. Since both variables have 4 answer categories, (4 * 4 =) 16 different combinations may occur in the data. For each combination (or “cell”), the table presents the frequency with which it occurs. We already saw a visual representation of these 16 **observed frequencies** in the graph we ran earlier.

Next, we'll inspect the **Chi-Square Tests** table. Now, the null hypothesis of independence implies that each cell should contain a given frequency. However, the observed frequencies often differ from such **expected frequencies**. The **Pearson Chi-Square** test statistic basically expresses the total difference between the 16 observed frequencies and their expected counterparts; the larger its value, the larger the difference between the data and the null hypothesis.
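The "total difference" can be sketched in a few lines of Python (outside SPSS). This is the standard Pearson formula: the sum over all cells of (observed - expected)² / expected, with (rows - 1) * (columns - 1) degrees of freedom. The function name and the small table are assumptions for illustration only.

```python
def pearson_chi_square(observed):
    """Return (chi-square, df) for a table of observed counts.

    Assumes every row total and column total is positive, so that no
    expected frequency is zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under independence
            chi2 += (obs - exp) ** 2 / exp

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical 2 x 2 table: every expected frequency is 15.
chi2, df = pearson_chi_square([[10, 20], [20, 10]])
print(round(chi2, 3), df)  # 6.667 1
```

For a 4 by 4 table such as ours, the same function returns (4 - 1) * (4 - 1) = 9 degrees of freedom.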

The p-value, denoted by “Asymp. Sig. (2-tailed)”, is .000 when rounded to three decimals. This means that there's a close to 0% chance to find the observed (or a larger) degree of association between the variables if they're perfectly independent in the population.
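To see how small the p-value behind “.000” really is, here's a rough Python sketch (not something SPSS runs). It uses a textbook closed-form expression for the upper tail of a chi-square distribution that holds only for odd degrees of freedom; the function name is made up for illustration. The values 131.2 and 9 are the test statistic and df reported in this tutorial.

```python
import math

def chi_square_p_value_odd_df(chi2, df):
    """Upper-tail probability of a chi-square distribution with odd df.

    Closed form for odd df: 2*Q(x) + 2*phi(x) * sum of x**(2r-1) / (1*3*...*(2r-1))
    for r = 1 .. (df-1)/2, where x = sqrt(chi2), Q is the standard normal
    upper tail and phi the standard normal density.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this sketch only handles odd degrees of freedom")
    if chi2 <= 0:
        return 1.0

    x = math.sqrt(chi2)
    normal_tail = 0.5 * math.erfc(x / math.sqrt(2))
    normal_density = math.exp(-chi2 / 2) / math.sqrt(2 * math.pi)

    series = 0.0
    term = x  # r = 1 term: x / 1
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= chi2 / (2 * r + 1)  # next term: multiply by x**2 / (2r + 1)

    return 2 * normal_tail + 2 * normal_density * series

p = chi_square_p_value_odd_df(131.2, 9)
print(p)            # tiny, but strictly greater than zero
print(round(p, 3))  # 0.0, which SPSS displays as .000
```

So the p-value is not exactly zero; it just vanishes when rounded to three decimals.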

## 5. Reporting the Chi-Square Independence Test

We always report the crosstabulation of observed frequencies. “Contingency table” or “bivariate frequency distribution” are synonyms for crosstabulation. Regarding the significance test, we report the Pearson Chi-Square value, df (= degrees of freedom) and p-value as in *“we observed a strong association between the current and the preferred brands, χ²(9) = 131.2, p < .001.”*
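If you report many tests, a small helper can keep the notation consistent. The Python sketch below is just one possible convention and its function names are hypothetical: it writes p-values below .001 as “p < .001” (a common choice, for instance in APA style) and drops the leading zero otherwise.

```python
def format_p(p):
    """Format a p-value with three decimals and no leading zero."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def format_chi_square(chi2, df, p):
    """Build a report string such as 'χ²(9) = 131.2, p < .001'."""
    return f"χ²({df}) = {chi2:.1f}, {format_p(p)}"

# 131.2 and 9 are the values from this tutorial; any p-value below .001
# gives the same string, so 0.0001 is only a stand-in here.
print(format_chi_square(131.2, 9, 0.0001))  # χ²(9) = 131.2, p < .001
```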

## This Tutorial has 53 Comments

## By Paul-Anthony Kweku somiah on August 5th, 2015

It is a very tactful tutorial and I hope it continues, so that more researchers, especially student researchers, can benefit from it.

## By Ruben Geert van den Berg on March 16th, 2015

The chi-square test results are not affected by swapping rows and columns. You can choose the option here that's most convenient for you, which may depend on the number of categories in either variable and the question you're aiming to answer.

## By Ounsa on March 16th, 2015

What variables shall I put in the rows and what variables shall I put in the columns?

## By Ruben Geert van den Berg on November 10th, 2014

Thank you for your feedback. You're correct that nonzero decimals may be present in the p value presented as .000. Perhaps "a close to 0% chance" would have been better here. Then again, you could argue that -scientifically speaking- "0%" could mean anything between -.5% and .5%. Note that I didn't write "exactly 0%". I'm aiming at being as brief as possible in these tutorials and I'm actually working on some more detailed ones in which I'll go into theoretical detail a bit more.

With regard to your last remark: note that some statistical tests (such as the binomial test) use a probability distribution rather than a density function. There's nothing asymptotic about a binomial distribution.

## By Martin F. Sherman on November 8th, 2014

You indicate that the probability of the chi squared test is equal to .000 which you then state "that there's a 0% chance to find the observed (or large) degree of association between the variables." This is not exactly correct. The reason the p value is equal to .000 is that SPSS only carried the p value to three decimal places. If you actually carried the p value out further you would discover that the p value is not equal to .000. It is impossible for the p value to be exactly equal to .000. All of statistical testing is based on probability and theoretical distributions that are asymptotic to the X axis. Hence, no such thing as a p value equal to .000 and that there's 0% chance.