# SPSS tutorials


# SPSS Create Dummy Variables Tool

## Summary

Creating dummy variables for several categorical variables with basic syntax is usually not hard. However, applying proper variable labels to the newly created dummy variables takes quite a bit of effort. The tool presented in this tutorial takes care of this, and some other issues, more easily.

## SPSS Dummy Variables Tool

SPSS Create Dummy Variables Tool

## Instructions

• Make sure you have SPSS version 17 or higher and the SPSS Python Essentials properly installed.
• Download and install the SPSS Create Dummy Variables Tool. Note that this is an SPSS custom dialog.
• Go to Utilities > Create Dummy Variables. Fill in the name(s) of the variables you'd like to dummy code.
• Click Paste and run the pasted syntax.
• This will create dummy variables with variable labels and value labels automatically applied to them. Note that their variable names consist of (the original variable name) + a suffix + (a numeric index).
• Clicking the tool's button will take you to this tutorial.
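
The naming rule from the instructions can be sketched in plain Python. This is only an illustration, not the tool's code: the helper name `dummy_names` is hypothetical, and the default suffix `_d` is taken from the syntax version of the tool further down this page.

```python
def dummy_names(var_name, n_values, sep="_d"):
    """Return the dummy variable names for one source variable (hypothetical helper)."""
    # original variable name + suffix + numeric index, with the index starting at 1
    return ["%s%s%d" % (var_name, sep, index) for index in range(1, n_values + 1)]


print(dummy_names("education_type", 6))
```

Passing a different `sep`, such as `'_dummy_'`, changes only the suffix part of each name.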

## Overview Result Dummy Variables Tool

Let's say we'd like to dummify “education_type” from employees.sav. First note that 5 value labels have been defined for this variable. This can be seen under variable view as shown below. Using the dummy variables tool results in 6 new dummy variables, one of which represents the system missing value. Their variable labels contain the values they represent and the corresponding value labels. This is shown in the screenshot below.

SPSS Dummy Variables Tool - Result

## Final Notes

First, note that the suffixes for the new variable names (for instance, “_d1” in our example) don't have any substantive meaning. That is, “_d1” says nothing about which value this variable represents (in fact, it represents a system missing value in our example). The actual meaning of the dummy variables is solely contained in their variable labels.
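
A minimal sketch of why the suffix is only a position. It assumes the distinct values are sorted with the missing value first, which is what the example's result suggests. The values below are made up, and `None` stands in for a system missing value.

```python
# Hypothetical raw values; None stands in for a system missing value
values = [1.0, 0.0, None, 2.0, 1.0, 4.0]

# Distinct values with None first, then the remaining values ascending.
# The key avoids comparing None with a float, which Python 3 does not allow.
distinct = sorted(set(values), key=lambda v: (v is not None, v))

# The suffix index is just the position in this ordering
suffix_to_value = {"_d%d" % (i + 1): v for i, v in enumerate(distinct)}

for suffix, value in suffix_to_value.items():
    print(suffix, "(system missing)" if value is None else value)
```

Here `_d1` belongs to the missing value and `_d5` to the value 4, so the index cannot be read as the value itself.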

Second, a dummy variable will be created for each distinct value that's actually present in the original variable, regardless of whether a value label has been defined for it. A value that does not occur in the original variable (but may have a value label nevertheless) does not need a dummy variable and is therefore skipped.
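
This rule can be illustrated with a short Python sketch. The function name `plan_dummies` and the data are hypothetical. It only mimics the selection logic and does not talk to SPSS.

```python
def plan_dummies(values, value_labels):
    """Pair each value present in the data with its label, if any (hypothetical helper)."""
    present = sorted(set(values))
    # Every value that occurs gets a dummy; the label is None when none is defined
    plan = [(value, value_labels.get(value)) for value in present]
    # Labelled values that never occur in the data get no dummy
    skipped = sorted(set(value_labels) - set(present))
    return plan, skipped


values = [0, 1, 1, 4, 0]                           # 4 occurs but has no label
value_labels = {0: "No", 1: "Yes", 2: "Don't know"}  # 2 is labelled but never occurs

plan, skipped = plan_dummies(values, value_labels)
print(plan)
print(skipped)
```

In this made-up case, three dummies are planned (for 0, 1 and the unlabelled 4), and the labelled value 2 is skipped.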

## SPSS Python Syntax Example

Instead of using the Custom Dialog we just discussed, you may use the SPSS Python syntax version of this tool shown below. It includes the creation of some nasty test data we used for testing the tool (containing labelled and unlabelled string variables, user missing values, system missing values and so on).

********00. CREATE NIGHTMARE TEST DATA FOR DUMMIFYING.

set seed 2.

data list free/v1(a10).
begin data
"don't know" rat bat dog cat '' 'don"t know'
end data.

string v2(a10).
compute v2 = v1.

do repeat @v = v3 to v4.
compute @v = rv.binom(3,.5).
end repeat.

if $casenum = 5 v3 = $sysmis.
missing values v3 v4 (2).
if $casenum = 4 v3 = 4.

value labels v2 'cat' 'CAT!' 'rat' 'RAT!'.
value labels v3 0 'No' 1 'Yes' 2 'Don''t know' 4 'Don"t know'.
value labels v4 0 'Bad' 1 'Good' 2 'Not applicable'.

execute.

********10. DEFINE FUNCTION.

begin program.
def dummify(varSpec,sep = '_d'):
    import spss,spssaux,spssdata
    varDict = spssaux.VariableDict()
    varList = varDict.expand(varSpec)
    varList.sort(key = lambda x: varDict.VariableIndex(x)) # process variables in file order
    for var in varList:
        varType = varDict.VariableType(var) # 0 = numeric, else string length
        if varType == 0: # numeric variable: spssdata returns floats but ValueLabels has string keys, so convert keys to floats
            valSet = set(val[0] for val in spssdata.Spssdata(var,convertUserMissing=False).fetchall())
            valList = sorted(valSet,key = lambda x: (x is not None,x)) # system missing (None) first, as under Python 2; the key also works under Python 3
            valLabs = dict((float(key),val) for key,val in varDict.ValueLabels(var).items())
        else:
            valList = sorted(set([val[0].strip() for val in spssdata.Spssdata(var,convertUserMissing=False).fetchall()]))
            valLabs = varDict.ValueLabels(var)
        for cnt,val in enumerate(valList):
            varLabExt = ' (""' + valLabs[val].replace('"','""') + '"")' if val in valLabs else ''
            if varType > 0: # string variable: double any double quotes in the value
                val = val.strip().replace('"','""')
                spss.Submit('''recode %s ("%s" = 1)(else = 0) into %s%s%d.'''%(var,val,var,sep,cnt + 1))
                spss.Submit('''variable labels %s%s%d "Dummy variable indicating that %s = ""%s""%s.".'''\
                %(var,sep,cnt + 1,var,val,varLabExt))
            elif val is None: # system missing value
                spss.Submit('''recode %s (sysmis = 1)(else = 0) into %s%s%d.'''%(var,var,sep,cnt + 1))
                spss.Submit('''variable labels %s%s%d "Dummy variable indicating that %s = (system missing).".'''%(var,sep,cnt + 1,var))
            else: # any other numeric value, including user missing values
                spss.Submit('''recode %s (%f = 1)(else = 0) into %s%s%d.'''%(var,val,var,sep,cnt + 1))
                spss.Submit('''variable labels %s%s%d "Dummy variable indicating that %s = %s%s.".'''%(var,sep,cnt + 1,var,str(val),varLabExt)) # converting float to str suppresses excessive decimal places in varlab
            spss.Submit('''value labels %s%s%d 0 'False' 1 'True'.'''%(var,sep,cnt + 1))
    spss.Submit('execute.')
end program.

******20. TEST V1******.

*match files file */keep v1 to v4.
*execute.
output close all.

begin program.
dummify('v1')
end program.

******30. TEST V2******.

*match files file */keep v1 to v4.
execute.
output close all.

begin program.
dummify('v2')
end program.

******40. TEST V3******.

*match files file */keep v1 to v4.
execute.
output close all.

begin program.
dummify('v3',sep = '_dummy_')
end program.

******50. TEST V4******.

*match files file */keep v1 to v4.
execute.
output close all.

begin program.
dummify('v4')
end program.
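
The string branch of `dummify` doubles any double quotes in a value before placing it inside a double-quoted SPSS string. The sketch below isolates that step in plain Python so it can be tried without SPSS. The helper name `recode_command` is hypothetical, and it only builds the command text that `dummify` would pass to `spss.Submit`.

```python
def recode_command(var, value, index, sep="_d"):
    """Build the RECODE command for one string value (hypothetical helper)."""
    # Inside a double-quoted SPSS string, a literal double quote is written twice
    escaped = value.strip().replace('"', '""')
    return 'recode %s ("%s" = 1)(else = 0) into %s%s%d.' % (var, escaped, var, sep, index)


print(recode_command("v1", 'don"t know', 3))
# recode v1 ("don""t know" = 1)(else = 0) into v1_d3.
```

A value containing only a single quote, such as `don't know`, needs no escaping here because the command wraps the value in double quotes.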


# This tutorial has 13 comments

• ### By Ruben Geert van den Berg on November 1st, 2016

Hi Amna! Almost right: you have to tick "Include the SPSS Python Essentials" during the installation process. If you don't, then you have SPSS without the Python essentials and many users have a lot of trouble installing the Python essentials afterwards. Happy to hear the problem's been solved for now!

• ### By Amna on October 31st, 2016

Hey Rupert! I have SPSS 21. I thought Python was preinstalled with that? I installed it, anyways, but it is still not working... Doesn't matter, your syntax is great too, I have been using it. Thanks anyways for your help! BR from Vienna, Austria

• ### By Ruben Geert van den Berg on October 31st, 2016

Right. Well, I always encourage SPSS users to install the SPSS Python essentials because many tools (the dummifying tool, but many others too) require it. Now, the dummifying tool is a nice time and effort saver but you don't really need it. If you can't use it for whatever reason, you can always compute the dummy variables manually as explained in computing dummy variables.

Hope that helps!