Our previous lesson started out with a rather problematic data file. We reordered our variables with Python and saved our data as trials-ordered.sav, the starting point for this lesson. The screenshot below shows part of the data.
We find the long variable names problematic for two reasons. First, typing them into a syntax window is a lot of work and results in overly long, unmanageable syntax. More importantly, the underscores don't look nice in our output, yet variable names can't contain spaces.
We'll therefore set our variable names as variable labels, replacing the underscores with spaces. Finally, we'll replace the long names with nice and short ones.
1. Retrieve All Variable Names from Data
Retrieving all variable names from our data is a standard technique that we cover in Sort Variables in SPSS with Python. We'll do it with the syntax below once again.
begin program python3.
import spss
# Loop over all variable indices (0-based) and print each variable name
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    print(varNam)
end program.
2. Retrieve All Variable Labels from Data
Note that our first 3 variables already have a label. To make sure we don't overwrite them, we'll now inspect all variable labels as well. The spss module has a simple function for this: spss.GetVariableLabel.
begin program python3.
import spss
# Print the label of each variable; variables without a label return an empty string
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    varLab = spss.GetVariableLabel(ind)
    print(varLab)
end program.
3. Create Variable Labels with Python
If some variable does not have a label yet, Python will return an empty string. We'll check if this holds with if not varLab: which is True if the label is empty. For those variables, we'll create a variable label by replacing the underscores in their names with spaces. For now, we'll just print these labels.
begin program python3.
import spss
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    varLab = spss.GetVariableLabel(ind)
    if not varLab: # True if varLab is empty string (no VARIABLE LABEL set)
        varLab = varNam.replace("_"," ")
        print(varLab)
end program.
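The label logic can also be tried outside SPSS. The block below is a minimal sketch, assuming a hypothetical helper named label_from_name (it is not part of the spss module) and a made-up existing label, "Respondent ID".

```python
def label_from_name(var_name, var_label=""):
    """Return the existing label, or build one from the name if the label is empty."""
    if not var_label:  # an empty string means no variable label was set
        return var_name.replace("_", " ")
    return var_label

# Variable name taken from our data; "id" and its label are made up for illustration
print(label_from_name("Trial_1_Reaction_Time_Milliseconds"))
print(label_from_name("id", "Respondent ID"))
```

Keeping this in a small function makes it easy to check that existing labels are left alone before anything is run against real data.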
4. Create VARIABLE LABELS Commands
We'll now create and inspect VARIABLE LABELS commands with
"VARIABLE LABELS %s '%s'."%(varNam,varLab)
The first %s is replaced by the variable name, the second by its newly created label. This technique is explained in SPSS Python Text Replacement Tutorial.
begin program python3.
import spss
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    varLab = spss.GetVariableLabel(ind)
    if not varLab:
        varLab = varNam.replace("_"," ")
        # Only print the command for now so we can inspect it before running it
        print("VARIABLE LABELS %s '%s'."%(varNam,varLab))
end program.
Note: since we use single quotes around the variable label, the label itself must not contain unescaped single quotes. This is no issue here, but if it is, escape each single quote within the label by doubling it (2 single quotes).
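If labels could contain single quotes, the doubling can be done in Python before the command is built. This is a minimal sketch, assuming a hypothetical helper named variable_labels_command and a made-up label; our own data don't need it.

```python
def variable_labels_command(var_name, var_label):
    """Build a VARIABLE LABELS command, doubling any single quotes in the label."""
    safe_label = var_label.replace("'", "''")  # ' becomes '' inside a quoted SPSS string
    return "VARIABLE LABELS %s '%s'." % (var_name, safe_label)

# Made-up label containing an apostrophe
print(variable_labels_command("answer_1", "Respondent's answer"))
```

Labels without single quotes pass through unchanged, so the helper is safe to apply to every label.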
Result
5. Run VARIABLE LABELS Commands
Since our VARIABLE LABELS commands look fine, we'll now have Python run them in SPSS. We basically just replace print with spss.Submit. This concludes the first part of our job.
begin program python3.
import spss
for ind in range(spss.GetVariableCount()):
    varNam = spss.GetVariableName(ind)
    varLab = spss.GetVariableLabel(ind)
    if not varLab:
        varLab = varNam.replace("_"," ")
        # Have SPSS run the command instead of just printing it
        spss.Submit("VARIABLE LABELS %s '%s'."%(varNam,varLab))
end program.
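An alternative is to collect all commands first, inspect them, and submit them in one go. The block below is a sketch under a few assumptions: build_label_commands is a hypothetical helper, the stand-in data are made up so the code also runs outside SPSS, and it relies on spss.Submit accepting a list of command strings as well as a single string.

```python
try:
    import spss  # only available when Python runs inside SPSS
except ImportError:
    spss = None

def build_label_commands(names_and_labels):
    """Return one VARIABLE LABELS command for each (name, label) pair with an empty label."""
    commands = []
    for var_name, var_label in names_and_labels:
        if not var_label:
            new_label = var_name.replace("_", " ")
            commands.append("VARIABLE LABELS %s '%s'." % (var_name, new_label))
    return commands

if spss is not None:
    pairs = [(spss.GetVariableName(i), spss.GetVariableLabel(i))
             for i in range(spss.GetVariableCount())]
else:
    # Made-up stand-in data for running this sketch outside SPSS
    pairs = [("id", "Respondent ID"), ("Trial_20_Answer", "")]

commands = build_label_commands(pairs)
for command in commands:
    print(command)  # inspect the commands first

if spss is not None:
    spss.Submit(commands)  # assumption: a list of commands is accepted here
```

Building the list separately from submitting it keeps the inspection step and the run step in a single program.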
6. RENAME VARIABLES
At this point, our output will at least look good if we display only variable labels (not names) with SET TVARS LABELS. However, we still prefer nice and short variable names. If we use the TO keyword, a single-line RENAME VARIABLES command does the trick for us.
rename variables(Trial_1_Reaction_Time_Milliseconds to Trial_20_Answer = time_1 to time_20 answer_1 to answer_20).
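If the variables were not neatly ordered, the TO keyword would not work and the new names could be derived from the old ones with Python instead. The block below is only a sketch: short_name and rename_command are hypothetical helpers, and the pattern assumes every long name looks like Trial_<number>_Reaction_Time_Milliseconds or Trial_<number>_Answer, which we only know for certain for the two names shown above.

```python
import re

def short_name(long_name):
    """Map e.g. Trial_3_Answer to answer_3; return None if the name doesn't fit the pattern."""
    match = re.fullmatch(r"Trial_(\d+)_(Reaction_Time_Milliseconds|Answer)", long_name)
    if match is None:
        return None
    number, kind = match.groups()
    prefix = "time" if kind == "Reaction_Time_Milliseconds" else "answer"
    return "%s_%s" % (prefix, number)

def rename_command(long_names):
    """Build one RENAME VARIABLES command for all names that fit the assumed pattern."""
    pairs = [(name, short_name(name)) for name in long_names]
    pairs = [(old, new) for old, new in pairs if new is not None]
    if not pairs:
        return ""  # nothing to rename
    old_names = " ".join(old for old, new in pairs)
    new_names = " ".join(new for old, new in pairs)
    return "RENAME VARIABLES (%s = %s)." % (old_names, new_names)

# "id" is a made-up name that should be left alone
print(rename_command(["id", "Trial_1_Reaction_Time_Milliseconds", "Trial_20_Answer"]))
```

Because each old name is paired with its own new name, the result does not depend on the order of the variables in the data.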
Final Result
As shown, our variable names are now nice and short and we have decent variable labels as well. This will surely pay off when further editing or analyzing these data.
Thanks for reading!