This tutorial will investigate the association between metric variables with nice tables and charts. As an example, we'll use income_2010 and income_2011 from freelancers.sav.
Quick Data Check
Before jumping into analyses, let's first inspect whether both variables have plausible values. A fast way to do so is generating histograms by running FREQUENCIES. The syntax below does just that.
frequencies income_2010 income_2011
/format notable /histogram.
Finding and Specifying User Missing Values
Conclusion: although the histogram for income_2010 looks fine, income_2011 seems to have some extremely large value(s) that don't indicate yearly incomes.
One way to track these down is running FREQUENCIES and sorting the table in descending order of value (syntax below, step 1); the unlikely values then appear at the top of the frequencies table. This shows that income_2011 contains 99999997, which we'll specify as a user missing value (step 2).
We'll also hide the decimals of the values in both variables (step 3). This somewhat suppresses excessive decimal places in output tables. Note that a nice tool for doing so after running tables is available from SPSS Set Decimals Output Tables.
SPSS Find and Set Missing Values Syntax
*1. Frequency table sorted in descending order of value.
frequencies income_2011
/format dvalue.
*2. Specify 99999997 as user missing.
missing values income_2011(99999997).
*3. Hide dollar cents for more space on x axis.
formats income_2010 income_2011(dollar9).
*4. Quick check.
frequencies income_2010 income_2011
/format notable /histogram.
SPSS DESCRIPTIVES Table
A univariate DESCRIPTIVES table doesn't say anything about the association between two metric variables. However, as it's commonly included in reports, we'll run one too.
Optionally, styling can be applied by using an SPSS table template (.stt file). We used one to hide the table's title. “Valid N (listwise)” can't be hidden with a table template, but we used a Python script for doing so. Our final result is shown in the next screenshot.
descriptives income_2010 income_2011.
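A table template can also be applied from syntax rather than by hand. The sketch below assumes a hypothetical template file, my_template.stt, that SPSS can find in its current working directory. SET TLOOK applies the template to all tables created afterwards, so we switch it off again at the end.

```spss
*Sketch only: my_template.stt is a hypothetical file name.
set tlook 'my_template.stt'.
descriptives income_2010 income_2011.
*Return to the default table look for later output.
set tlook none.
```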
Creating SPSS Scatterplots
A great way to visualize the association, if any, between metric variables is a scatterplot. The screenshots below show how to create one.
For creating multiple scatterplots, copy-paste the syntax a couple of times and replace the variable names. For creating many scatterplots, have Python loop over the variable names and run the syntax for you.
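As a rough sketch of the Python approach, the block below loops over pairs of variable names and submits one GRAPH command per pair. It assumes the SPSS Python integration (the spss module) is installed. Note that income_2012 is a hypothetical variable used only for illustration; replace the pairs with variables that exist in your data.

```spss
*Sketch only: income_2012 is a hypothetical variable name.
begin program python3.
import spss

# (x, y) pairs of variable names to plot against each other.
pairs = [("income_2010", "income_2011"), ("income_2011", "income_2012")]

for x, y in pairs:
    # Build and run one scatterplot command per pair.
    spss.Submit("""
graph
/scatterplot(bivar)=%s with %s.
""" % (x, y))
end program.
```

Building the syntax as a string keeps the loop short: adding a scatterplot only requires adding a pair to the list.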
SPSS Scatterplot Syntax
Note: the graph resulting from running the syntax can be styled by applying an SPSS chart template (.sgt file). The screenshot below shows our final result after doing so.
GRAPH
/SCATTERPLOT(BIVAR)=income_2010 WITH income_2011
/TITLE='All Respondents (n = 39)'.
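Instead of styling the chart by hand, a chart template can be applied directly from syntax with the TEMPLATE subcommand. The sketch below assumes a hypothetical template file, my_chart_template.sgt, that SPSS can find in its current working directory.

```spss
*Sketch only: my_chart_template.sgt is a hypothetical file name.
graph
/scatterplot(bivar)=income_2010 with income_2011
/title='All Respondents (n = 39)'
/template='my_chart_template.sgt'.
```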
SPSS Scatterplot Example
Conclusion: income_2010 is very strongly related to income_2011. This relation is roughly linear. Honestly, the relation didn't look entirely linear to us at first. However, a quick CURVEFIT showed us that alternative models deviate from a linear relation only to a negligible extent. That is, the relation turned out to be much more linear than it seemed at first glance.
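A check of this kind could look like the sketch below. Which alternative models were actually compared is not stated, so the quadratic and cubic models here are our assumption. The output reports R-square for each model, so near-identical values suggest the curved models add little over a straight line.

```spss
*Sketch only: the choice of quadratic and cubic as alternatives is an assumption.
curvefit
/variables=income_2011 with income_2010
/constant
/model=linear quadratic cubic
/plot fit.
```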
SPSS CORRELATIONS Table
We already saw from our scatterplot that our two variables are strongly related in a linear fashion. The strength of this relation can be quantified by computing a Pearson correlation with CORRELATIONS.
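For reference, the Pearson correlation is defined as shown below. Here x_i and y_i denote the two incomes of respondent i, and n is the number of respondents with valid values on both variables. The notation is ours and does not appear in SPSS output.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Values close to 1 or -1 indicate a strong linear relation, whereas values close to 0 indicate no linear relation.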
The resulting table always contains p-values but these are nonsensical if their statistical assumptions haven't been met. We'll therefore hide them with a simple tool available from SPSS Correlations without Significance.
correlations income_2010 with income_2011.
Conclusion: the correlation of .904 confirms that there is indeed a very strong linear relation between income_2010 and income_2011.