SPSS Variable Types and Formats Tutorial
Understanding SPSS variable types and formats allows you to get things done fast and reliably. Getting a grip on types and formats is not hard if you ignore the very confusing information under variable view. This tutorial takes away the confusion and puts you back in control.
We encourage you to follow along with this tutorial by downloading and opening computer_parts.
SPSS Variable Types
SPSS has two variable types: string and numeric. Numeric variables may contain only numbers. String variables may contain letters, numbers and other characters. The distinction between numeric and string variables is important because the variable type dictates what you can or cannot do with a variable.
- You can do calculations with numeric variables but not with string variables.
- You can apply string functions, such as extracting substrings or concatenating, to string variables but not to numeric variables.
There are no other variable types in SPSS than string and numeric. However, numeric variables have several different formats that are often confused with variable types. We'll see in a minute how SPSS variable view puts many users on the wrong track here.
Determining SPSS Variable Types
Before doing anything whatsoever with a variable, we always want to know whether it's a string or numeric variable. Don't rely on a visual inspection of your data view for determining variable types; it may be hard, sometimes impossible to see the difference between the two variable types. Instead, inspect your variable view and use the following rule:
- if “Type” is “String”, you're dealing with a string variable;
- if “Type” is anything other than “String”, you're dealing with a numeric variable.
SPSS suggests that “Date” and “Dollar” are variable types as well. However, these are formats, not types. The way they are shown here among the actual variable types (string and numeric) is one of SPSS’ most confusing features.
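The rule above is simple enough to write down as a small function. The following Python sketch is only an illustration outside SPSS: the function name `variable_type` is hypothetical, and it assumes the “Type” label has been copied from variable view exactly as displayed.

```python
def variable_type(type_label: str) -> str:
    """Classify a variable from its "Type" label in variable view.

    Only the label "String" denotes a string variable; every other
    label (for example "Numeric", "Date" or "Dollar") is a format of
    a numeric variable.
    """
    return "string" if type_label.strip() == "String" else "numeric"


# Hypothetical labels as they might appear under "Type".
for label in ("String", "Numeric", "Date", "Dollar"):
    print(label, "->", variable_type(label))
```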
SPSS Variable Formats - Introduction
Let's now have a look at the data under data view as shown in the screenshot below. We'll briefly describe the kinds of variables we see.
- The first variable holds words;
- the second variable holds numbers with two decimal places;
- the third variable holds dates;
- the fourth variable holds times;
- the fifth variable holds dates and times;
- the sixth variable holds percentages;
- the seventh variable holds numbers of dollars with two decimal places.
Regarding these data, we concluded earlier that the first variable is a string variable and the second through seventh variables are numeric. Remember that numeric variables can contain only numbers. However, SPSS can display these numbers in very different ways. At this point we see that numeric values have two components:
- First, there are the actual values as SPSS stores them internally. These consist of nothing but numbers.
- Second, the actual values can be displayed and treated in a myriad of different ways. This is why numeric variables may seem to contain month names or dollar signs.
These different ways of displaying and treating the actual values are referred to as variable formats.
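To make the split between stored value and display concrete: SPSS stores a date as the number of seconds since midnight, October 14, 1582, and a date format merely renders that number as a calendar date. The Python sketch below mimics this outside SPSS. The function names and the example date are assumptions made for illustration; they are not taken from the tutorial's data file.

```python
from datetime import datetime, timedelta

# SPSS counts date values in seconds from this moment.
SPSS_EPOCH = datetime(1582, 10, 14)


def to_spss_seconds(moment: datetime) -> float:
    """Return the plain number SPSS would store for a date."""
    return (moment - SPSS_EPOCH).total_seconds()


def from_spss_seconds(seconds: float) -> datetime:
    """Turn a stored number back into a calendar date."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


stored = to_spss_seconds(datetime(2015, 3, 1))  # hypothetical example date
print(stored)                                   # the actual value: just a number
print(from_spss_seconds(stored).strftime("%d-%b-%Y"))  # the same value shown as a date
```

The stored number never changes here; only the way it is rendered does, which is what a format controls.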
Determining SPSS Variable Formats
As we saw earlier, “Type” under variable view shows a confusing mixture of variable types and formats. Unfortunately, it doesn't allow us to determine the actual formats. However, the following line of syntax does the trick here: display dictionary. After running it, we see one or more tables with dictionary information in the Output Viewer window as shown by the screenshot below.
SPSS distinguishes print and write formats but we won't bother with this distinction. SPSS variable formats consist of two parts. One or more letters indicate the format family. Most of them speak for themselves, except for those of the first two variables:
- A (“Alphanumeric”) is the usual format for string variables;
- F (“Fortran”) indicates a standard numeric variable.
Formats end with numbers, indicating the number of characters to be shown. Strictly, in the case of string variables the number indicates the maximum number of bytes that each value may consist of. For more on this, see Unicode mode. If a period is present, the number after the period indicates the number of decimal places to be displayed.
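The structure just described (family letters, a width and optional decimals) can be taken apart mechanically. The Python sketch below is an illustration outside SPSS: `parse_format` and `SpssFormat` are hypothetical names, and the format strings passed in are examples rather than values read from the data file.

```python
import re
from typing import NamedTuple


class SpssFormat(NamedTuple):
    family: str    # letters, e.g. "A", "F", "DOLLAR"
    width: int     # characters shown (maximum bytes for string variables)
    decimals: int  # decimal places shown; 0 if no period is present


def parse_format(text: str) -> SpssFormat:
    """Split a format such as 'F8.2' into family, width and decimals."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"not a recognisable format: {text!r}")
    family, width, decimals = match.groups()
    return SpssFormat(family.upper(), int(width), int(decimals or 0))


for example in ("A15", "F8.2", "DOLLAR10.2"):
    fmt = parse_format(example)
    # "A" is the usual string family; rarer string families are ignored here.
    kind = "string" if fmt.family == "A" else "numeric"
    print(example, fmt, kind)
```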
SPSS Common Variable Formats
The table below disambiguates variable types, format families and formats for the data we've been studying so far.
|Variable Type||Format family||Format (example)||Shown as|