# SPSS Tutorials


# SPSS – Process Variables Based on Names

We held a reaction time experiment in which people had to solve 20 puzzles as fast as possible. The 20 answers and reaction times are in an SPSS data file which we named trials.sav. Part of it is shown below.

At first, we'd just like to inspect the histograms of our reaction time variables. The easiest way is running FREQUENCIES as shown below, but specifying the right variable names is cumbersome even for this very tiny data file.

```
*Required command may look something like...

frequencies Trial_1_Reaction_Time_Milliseconds Trial_2_Reaction_Time_Milliseconds Trial_3_Reaction_Time_Milliseconds /*and so on through 20.
/format notable
/histogram.
```

## 1. Retrieve Variable Names by Index

A very easy option here is to include only those variables in our command that have “Time” in their name. Filtering variables by (part of their) names, labels, measurement levels or variable types could hardly be any easier if we have Python do it for us.
Now, we have 43 variables in our data, but Python starts counting from 0. So our first and last variables should be indexed 0 and 42 by Python. Let's see if that's right by running the syntax below.

```
*Retrieve names of first and last variable by Python index.

begin program.
import spss
print(spss.GetVariableName(0))
print(spss.GetVariableName(42))
end program.
```

## 2. Retrieve All Variable Indices

So if we can retrieve the first and last variable names by their Python indices (0 and 42), then we can retrieve all of them if we have all indices. A standard way of doing just that is the Python range function. As shown below, it generates all indices, which we print as a Python list.

```
*Create Python list of all variable indices.

begin program.
import spss
print(spss.GetVariableCount())
print(list(range(spss.GetVariableCount())))
end program.
```
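One detail worth knowing: in Python 3, `range` returns a lazy range object rather than a list, so wrapping it in `list()` is what makes the indices visible when printed. The sketch below runs outside SPSS and uses a plain number, `variable_count`, as a stand-in for `spss.GetVariableCount()`. The value 43 is simply the variable count mentioned above.

```python
# Stand-in for spss.GetVariableCount(); 43 is the variable count from this tutorial.
variable_count = 43

# range() itself is lazy in Python 3; list() materializes the indices.
indices = list(range(variable_count))

# First index, last index and number of indices.
print(indices[0], indices[-1], len(indices))
```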

## 3. Retrieve All Variable Names

We'll now run a very simple Python loop over our variable indices. In each iteration, we'll retrieve one variable name, resulting in the names of all variables in our data. We'll filter out our target variables in the next step.

```
*Retrieve all variable names.

begin program.
import spss
for ind in range(spss.GetVariableCount()):
    print(spss.GetVariableName(ind))
end program.
```

## 4. Filter Variable Names

Before we create and run the required syntax, we still need to filter out only those variables having “Time” in their names. We'll use a very simple Python if statement for doing so. The syntax below retrieves exactly our target variables.

```
*Retrieve all variable names holding "Time".

begin program.
import spss
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    if 'Time' in varNam:
        print(varNam)
end program.
```
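Note that the `in` test is case sensitive: `'Time'` will not match a name containing only `time`. The sketch below illustrates this in plain Python, outside SPSS. The list `var_names` and the helper `names_containing` are hypothetical, introduced only for illustration. In a real session the names would come from `spss.GetVariableName`.

```python
# Hypothetical variable names standing in for what spss.GetVariableName() returns.
var_names = [
    "Trial_1_Answer",
    "Trial_1_Reaction_Time_Milliseconds",
    "Trial_2_Answer",
    "Trial_2_Reaction_Time_Milliseconds",
]


def names_containing(names, fragment, ignore_case=False):
    """Return the names that contain fragment, preserving their order."""
    if ignore_case:
        fragment = fragment.lower()
        return [name for name in names if fragment in name.lower()]
    return [name for name in names if fragment in name]


# Case-sensitive filter, equivalent to the if statement in the loop above.
time_vars = names_containing(var_names, "Time")
print(time_vars)

# A lowercase fragment only matches when case is ignored.
print(names_containing(var_names, "time"))
print(names_containing(var_names, "time", ignore_case=True))
```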

## 5. Create Python String Holding Target Variables

We'd like to create some SPSS syntax containing the variables we selected in the previous syntax. We'll first pass the names into a Python string we'll call `timeVars`. We first create it as an empty string and then concatenate each variable name and a space to it.

```
*Create Python string holding all reaction time variables.

begin program.
import spss
timeVars = ''
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    if 'Time' in varNam:
        timeVars += varNam + ' '
print(timeVars)
end program.
```
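As an alternative to concatenating in a loop, the names could first be collected in a list and then joined with spaces, which avoids the trailing space. This is a minimal sketch in plain Python. The list `selected` is a hypothetical stand-in for the names picked up by the loop above.

```python
# Hypothetical names standing in for the variables selected by the loop above.
selected = [
    "Trial_1_Reaction_Time_Milliseconds",
    "Trial_2_Reaction_Time_Milliseconds",
]

# str.join puts a single space between names and nothing after the last one.
time_vars = " ".join(selected)
print(time_vars)
```

The trailing space produced by the loop version is harmless in SPSS syntax, so this is a matter of taste rather than a correction.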

## 6. Create Required SPSS Syntax

We're almost there now. The main step left is to create the basic FREQUENCIES command we're after and insert the string holding variable names into it. We'll do so with a simple Python text replacement.

```
*Create entire SPSS command for histograms.

begin program.
import spss
timeVars = ''
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    if 'Time' in varNam:
        timeVars += varNam + ' '
spssSyntax = '''
FREQUENCIES %s
/FORMAT NOTABLE
/HISTOGRAM.
'''%timeVars
print(spssSyntax)
end program.
```

## 7. Create and Run Desired Syntax

Let's take a close look at the syntax we just created. Is it exactly what we need? Sure? Then we'll comment out the `print` command and run our syntax with `spss.Submit` instead.

```
*Create and run required command.

begin program.
import spss
timeVars = ''
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    if 'Time' in varNam:
        timeVars += varNam + ' '
spssSyntax = '''
FREQUENCIES %s
/FORMAT NOTABLE
/HISTOGRAM.
'''%timeVars
#print(spssSyntax)
spss.Submit(spssSyntax)
end program.
```
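If no variable name happens to contain “Time”, the command above would be submitted without any variables. One possible safeguard, sketched below in plain Python, is to build the syntax in a small function that refuses to continue when nothing matched. The function name `build_frequencies_syntax` is hypothetical and not part of the `spss` module.

```python
def build_frequencies_syntax(var_names):
    """Build a FREQUENCIES command for the given names, or fail if there are none."""
    if not var_names:
        raise ValueError("no variable names matched; nothing to submit")
    return "FREQUENCIES %s\n/FORMAT NOTABLE\n/HISTOGRAM." % " ".join(var_names)


# Hypothetical names standing in for the filtered reaction time variables.
syntax = build_frequencies_syntax(
    ["Trial_1_Reaction_Time_Milliseconds", "Trial_2_Reaction_Time_Milliseconds"]
)
print(syntax)
```

Inside a `begin program` block, the resulting string could then be passed to `spss.Submit` just like `spssSyntax` above.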

## Final Notes

While developing a solution, we quietly assumed we weren't allowed to make any changes to the data. In our opinion, the very first thing that should be done here is to set the crazy long variable names as variable labels and use short variable names instead.
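The renaming idea could be sketched along the same lines: have Python generate RENAME VARIABLES and VARIABLE LABELS commands from the long names. The example below runs outside SPSS. The short naming scheme (`rt_1`, `rt_2`, ...) and the helpers `short_name` and `relabel_syntax` are assumptions made for illustration, not something the data or the `spss` module prescribes.

```python
import re


def short_name(long_name):
    """Map e.g. 'Trial_7_Reaction_Time_Milliseconds' to 'rt_7' (hypothetical scheme)."""
    match = re.match(r"Trial_(\d+)_Reaction_Time_Milliseconds$", long_name)
    if match is None:
        raise ValueError("unexpected variable name: %s" % long_name)
    return "rt_%s" % match.group(1)


def relabel_syntax(long_names):
    """Generate SPSS syntax that renames each variable and keeps the long name as its label."""
    lines = []
    for long_name in long_names:
        new_name = short_name(long_name)
        lines.append("RENAME VARIABLES (%s = %s)." % (long_name, new_name))
        lines.append("VARIABLE LABELS %s '%s'." % (new_name, long_name))
    return "\n".join(lines)


syntax = relabel_syntax(["Trial_1_Reaction_Time_Milliseconds"])
print(syntax)
```

As with the FREQUENCIES command, it seems wise to print and inspect the generated syntax before submitting it.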

Also, we'd rather reorder our variables (first all reaction times, then all answers). The basic way of doing so is shown in SPSS - Reorder Variables with Syntax. That way, we can use TO in a short FREQUENCIES command.

