When we start analyzing a data file, we first inspect it for a number of common problems. For instance, we want to be sure that variables have the right formats, contain no implausible values and have plausible distributions. The table below proposes which checks to run and in which order.
To get the most out of these tutorials, we encourage you to follow along by downloading and opening hotel_evaluation.sav.
SPSS Data Preparation - Overview Main Steps
|Step|Check|How to inspect|Possible action|
|---|---|---|---|
|1|Case count and variable count|Inspect data view|(Not applicable)|
|2|Unique case identifier|Use AGGREGATE command|Create unique ID variable|
|3|Undesirable variable types|Inspect variable view and data view|Convert variables|
|4|Presence of user missing values|Frequency table or histogram per variable|Specify user missing values|
|5|Variables with many missing values|Bar chart or histogram per variable|Drop variables or exclude from analyses|
|6|Inconvenient distributions|Bar chart or histogram per variable|Drop variables or exclude from analyses|
|7|Small categories|Bar chart per variable|Merge categories|
|8|Undesirable coding|Bar chart per variable|Reverse code variable(s)|
|9|Cases with many missing values|Compute number of missing values per case|Exclude cases from analysis|
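The corrective actions for steps 7 and 8 can be sketched with RECODE. This is a minimal illustration only: the variable names `educ` and `q4`, their codes and the new variable names are assumptions, not variables known to exist in hotel_evaluation.sav.

```spss
* Step 7 sketch: merge two small categories of educ (assumed name and codes) into one.
RECODE educ (1,2=1) (3=2) (4=3) INTO educ_merged.

* Step 8 sketch: reverse code a 5-point item (assumed name q4) into a new variable.
RECODE q4 (1=5) (2=4) (3=3) (4=2) (5=1) INTO q4_rev.
EXECUTE.
```

Recoding into new variables leaves the original values intact. Values not listed become system missing in the new variables, and value labels and user missing values have to be defined again for them.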
SPSS Data Preparation - Workflow
So how do we perform these checks in practice? We suggest performing steps 1-3 first because they concern the data file as a whole.
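As a rough sketch of step 2, the syntax below checks whether an existing variable identifies each case uniquely. The names `id`, `n_id` and `case_id` are assumptions for illustration; substitute whatever identifier your data file actually contains.

```spss
* Step 2 sketch: count how many cases share each value of id (assumed variable name).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /n_id=N.

* If id is a unique identifier, n_id equals 1 for every case.
FREQUENCIES VARIABLES=n_id.

* If no suitable identifier exists, the case number can serve as one.
COMPUTE case_id = $CASENUM.
FORMATS case_id (F8.0).
EXECUTE.
```

Any value of `n_id` larger than 1 in the frequency table points to duplicate identifiers that need a closer look before continuing.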
Next, we inspect each variable in our data separately, working from top to bottom. If the variable is categorical, we create a frequency table with a bar chart. If the variable is metric, we run a histogram. Based on this output, we perform steps 4-8.
All output for these steps is created with simple FREQUENCIES commands, which can take multiple variables at once. You may consider running these for groups of (similarly coded) variables rather than for each variable separately.
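The FREQUENCIES commands described above might look as follows. The variable names (`q1` to `q3`, `age`) and the missing value code 6 are hypothetical placeholders, not taken from hotel_evaluation.sav.

```spss
* Categorical variables (assumed names q1, q2, q3): frequency tables with bar charts.
FREQUENCIES VARIABLES=q1 q2 q3
  /BARCHART.

* Metric variable (assumed name age): histogram only, frequency table suppressed.
FREQUENCIES VARIABLES=age
  /FORMAT=NOTABLE
  /HISTOGRAM.

* Step 4 example: if code 6 turns out to mean no answer, declare it user missing.
MISSING VALUES q1 q2 q3 (6).
```

Running one command for a group of similarly coded variables keeps the output compact and makes deviating variables easier to spot.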
After inspecting, and possibly correcting, each variable, we wrap up with step 9.
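Step 9 can be sketched with COUNT. The variable range `q1 TO q5`, the new variable names and the cutoff of 2 missing values are assumptions chosen for illustration only.

```spss
* Step 9 sketch: count missing values per case over q1 to q5 (assumed, adjacent variables).
COUNT n_missing = q1 TO q5 (MISSING).
FREQUENCIES VARIABLES=n_missing.

* Keep only cases with at most 2 missing values in analyses (the cutoff is an arbitrary example).
COMPUTE keep_case = (n_missing <= 2).
FILTER BY keep_case.
```

Using FILTER rather than deleting cases keeps the excluded cases in the data file, so the decision can be reversed with FILTER OFF.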