Python for SPSS - How to Use It?

SPSS Python Essentials

First off, using Python in SPSS always requires that you have SPSS, Python and the SPSS-Python plugin installed.

These components are collectively known as the SPSS Python essentials. For recent SPSS versions, the Python essentials are installed by default. One way to check this is navigating to Edit → Options → File Locations, in which you'll probably find some Python location(s) as shown below.

SPSS Python Location In Edit Options

So what should you see here? Well,

Run Python from SPSS Syntax Window

Right. So if you have SPSS with the Python essentials properly installed, what's next?

Well, the simplest way to go is to run Python from an SPSS syntax window. Enclose all lines of Python between BEGIN PROGRAM PYTHON3. and END PROGRAM. as shown below.

Python 3 Program Block In SPSS Syntax Window

Try and copy-paste-run the entire syntax below. Note that this Python block simply lowercases all variable names, regardless of what they are or how many there are.

*SPSS syntax for creating empty test data.

data list free/V1 V2 v3 v4 EDUC gender SAlaRY.
begin data
end data.

*Run Python block for lowercasing all variable names.

begin program python3.
import spss,spssaux
oldNames = spssaux.GetVariableNamesList()
newNames = [var.lower() for var in oldNames]
spss.Submit("RENAME VARIABLES (%s = %s)."%(' '.join(oldNames),' '.join(newNames)))
end program.
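
To see exactly which command the block above hands to spss.Submit, the string-building step can be tried in plain Python without SPSS. This is a minimal sketch: the helper name build_rename_syntax is made up for illustration and is not part of the spss or spssaux modules, and the names are simply typed in rather than read from a dataset.

```python
def build_rename_syntax(old_names):
    """Return a RENAME VARIABLES command that lowercases the given names."""
    new_names = [name.lower() for name in old_names]
    return "RENAME VARIABLES (%s = %s)." % (" ".join(old_names), " ".join(new_names))

# Variable names copied from the test data above.
names = ["V1", "V2", "v3", "v4", "EDUC", "gender", "SAlaRY"]
syntax = build_rename_syntax(names)
print(syntax)
# RENAME VARIABLES (V1 V2 v3 v4 EDUC gender SAlaRY = v1 v2 v3 v4 educ gender salary).
```

Printing the command before submitting it is a handy way to debug Python blocks that build SPSS syntax.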

Wrap Python Code into Functions

Right, so we just ran some Python from an SPSS syntax window. Now, this works fine but doing so has some drawbacks:

A first step towards resolving these issues is to wrap our Python code into a Python function.

*Create empty test data.

data list free/V1 V2 v3 v4 EDUC gender SAlaRY.
begin data
end data.

*Define lowerCaseVars as Python function.

begin program python3.
def lowerCaseVars():
    import spss,spssaux
    oldNames = spssaux.GetVariableNamesList()
    newNames = [var.lower() for var in oldNames]
    spss.Submit("RENAME VARIABLES (%s = %s)."%(' '.join(oldNames),' '.join(newNames)))
end program.

*Run function.

begin program python3.
lowerCaseVars()
end program.

Note that we first define a Python function and then run it. Like so, you can develop a single SPSS syntax file containing several such functions.

Running this file just once (preferably with INSERT) defines all of your Python functions. You can now use these for all projects you'll work on during your SPSS session.

Write Your Own Python Module

We just defined and then ran a function. The next step is moving our function into a Python file: a plain text file with the .py extension that we'll place in C:\Program Files\IBM\SPSS Statistics\Python3\Lib\site-packages or wherever our site-packages folder is located.

Python Module In Site Packages Folder

We can now edit this file with Notepad++, which is much nicer than SPSS’ syntax editor. Since a Python file contains only Python, we'll leave out BEGIN PROGRAM PYTHON3. and END PROGRAM.

Python Module Contents
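
As a rough sketch, the module file could contain nothing more than the function we defined earlier. The file name ruben.py is an assumption based on the import ruben line used in this tutorial; your own module can have any valid name.

```python
# Hypothetical contents of ruben.py in the site-packages folder.
# A module file holds plain Python only: no BEGIN PROGRAM / END PROGRAM lines.

def lowerCaseVars():
    """Lowercase all variable names in the active SPSS dataset."""
    # spss and spssaux ship with SPSS. Importing them inside the function
    # means the file itself can be loaded even where SPSS is not available.
    import spss, spssaux
    oldNames = spssaux.GetVariableNamesList()
    newNames = [var.lower() for var in oldNames]
    spss.Submit("RENAME VARIABLES (%s = %s)." % (' '.join(oldNames), ' '.join(newNames)))
```

Calling the function only works from within SPSS because that's where the spss and spssaux modules live.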

If we now import our module in SPSS, we can readily run any function it contains as shown below.

*Create empty test data.

data list free/V1 v2 V3 V4 v5 V6.
begin data
end data.

*Import module and lowercase variable names.

begin program python3.
import ruben
ruben.lowerCaseVars()
end program.

Developing and using our own Python module has great advantages:

A quick tip: if you're developing your module, reload it after each edit.

*Tip: if you're editing your module, reload it before each use.

begin program python3.
import ruben,importlib # import ruben and importlib modules
importlib.reload(ruben) # use importlib to reload ruben module
ruben.lowerCaseVars() # run function from ruben module
end program.
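
Why is reloading needed? Python imports a module only once per session, so a second import statement does not pick up your edits. The sketch below illustrates this outside SPSS with a throwaway module; the name demo_module and the temporary folder are made up for this demonstration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

with tempfile.TemporaryDirectory() as folder:
    module_file = Path(folder) / "demo_module.py"
    module_file.write_text("def greet():\n    return 'version 1'\n")
    sys.path.insert(0, folder)       # make the folder importable
    importlib.invalidate_caches()    # make sure the new file is noticed

    import demo_module
    first = demo_module.greet()

    # Simulate editing the module in a text editor.
    module_file.write_text("def greet():\n    return 'version 2 (edited)'\n")

    import demo_module               # no effect: module is already loaded
    second = demo_module.greet()

    importlib.reload(demo_module)    # re-reads the edited file
    third = demo_module.greet()

    sys.path.remove(folder)

print(first)   # version 1
print(second)  # version 1
print(third)   # version 2 (edited)
```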

Create an SPSS Extension

SPSS extensions are tools that can be developed by all SPSS users for a wide variety of tasks. For an outstanding collection of SPSS extensions, visit SPSS Tools - Overview.

Extensions are easy to install and can typically be run from SPSS menu dialogs as shown below.

SPSS Create All Scatterplots Tool Dialog 2

So how does this work and what does it have to do with Python?

Well, most extensions define new SPSS syntax commands. These are not much different from built-in commands such as FREQUENCIES or DESCRIPTIVES. The syntax below shows an example from SPSS - Create All Scatterplots Tool.

*Fit all possible curves for 4 predictors onto single dependent variable.

SPSS TUTORIALS SCATTERS YVARS=costs XVARS=alco cigs exer age
/OPTIONS ANALYSIS=FITALLTABLES ACTION=RUN.

Now, running this SPSS syntax command basically passes its arguments (such as input/output variables, values or titles) on to an underlying Python function and runs it. This Python function, in turn, creates and runs SPSS syntax that gets the final job done.

Note that SPSS users don't see any Python when running this syntax, unless they manage to make the Python code crash. To actually see the Python code, you may unzip the SPSS extension (.spe) file and look for a Python (.py) file in the resulting folder.

Unzipping an SPSS extension (.spe) file results in a folder in which you'll usually find a Python (.py) file.
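
The mechanism described above (arguments go into a Python function, SPSS syntax comes out) can be sketched in a few lines. Mind you, this is not the actual code of the scatterplots tool: the function name build_scatter_syntax is hypothetical and the generated GRAPH commands are just one plausible way of getting the job done.

```python
def build_scatter_syntax(yvar, xvars):
    """Return one GRAPH command per predictor as a list of SPSS syntax lines."""
    return ["GRAPH /SCATTERPLOT(BIVAR)=%s WITH %s." % (xvar, yvar) for xvar in xvars]

# Arguments as they might arrive from the YVARS and XVARS keywords.
commands = build_scatter_syntax("costs", ["alco", "cigs", "exer", "age"])
for command in commands:
    print(command)
# GRAPH /SCATTERPLOT(BIVAR)=alco WITH costs.
# GRAPH /SCATTERPLOT(BIVAR)=cigs WITH costs.
# GRAPH /SCATTERPLOT(BIVAR)=exer WITH costs.
# GRAPH /SCATTERPLOT(BIVAR)=age WITH costs.
```

Within SPSS, a real extension would pass such commands to spss.Submit instead of printing them.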

A final note on SPSS extensions is that developing them is seriously challenging and takes a lot of practice. However, well-written extensions can save you tons of time and effort over the years to come.

Thanks for reading!

Python for SPSS – What is It?

Serious SPSS users have probably heard that using Python in SPSS can dramatically speed up your daily SPSS tasks. So what is it and how does it work? This brief tutorial quickly walks you through.

Introduction

Python is one of the main general-purpose programming languages today. Its development started in 1989 and it was first released in 1991.

Python Logo

SPSS (short for “statistical package for the social sciences”) is user friendly software for data editing/analysis and statistical procedures. SPSS is much older than Python: it's already been in use since 1968. Originally, Python had nothing whatsoever to do with SPSS. They were simply 2 completely unrelated software packages until roughly 2005.

So How Does Python Relate to SPSS?

Around 2005, the SPSS developers created software that connects SPSS with Python: the SPSS-Python plugin. This plugin made it possible to

The figure below sketches some of these interactions between SPSS and Python.

Overview SPSS Python Interactions

What Is Python Known For?

Why Should I Use Python in SPSS?

SPSS Python - Shorter Syntax Which Approach Looks Better to You?

Where Can I Get Python for SPSS?

Finally an easy question... Recent SPSS versions are integrated with Python by default. It is located in the Python3 folder in your SPSS installation folder as shown below.

Python3 Folder In SPSS Folder

For more details on this, read up on Python for SPSS - How to Use It?
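
If you'd like to confirm which Python your SPSS session actually runs, a quick check is to ask Python itself. The lines below are a minimal sketch using only the standard sys module; paste them between BEGIN PROGRAM PYTHON3. and END PROGRAM. to run them from a syntax window. The exact output depends on your installation.

```python
import sys

major = sys.version_info.major  # 3 for any Python 3 installation
print(sys.version)              # full version string of the running Python
print(sys.prefix)               # folder of the Python installation in use
```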

Which are Some SPSS Python Examples?

Some excellent SPSS-Python code is found in many of our SPSS tools:

Example tool that uses Python for SPSS: SPSS Create All Scatterplots Tool

Note: when using these tools, you don't immediately see the underlying Python code. However, if you unzip the SPSS extension (.spe) files, you'll find that each of them contains a Python (.py) file that contains the SPSS-Python code being used.

Thanks for reading!

Main Differences Python and SPSS Syntax

Python is Fully Case Sensitive

Experienced SPSS users probably know that SPSS syntax is mostly case insensitive: it doesn't usually matter whether you write syntax in lower case, upper case or a mixture of these. Like so, the FREQUENCIES commands below are all equivalent.

*SPSS SYNTAX IS MOSTLY CASE INSENSITIVE.

FREQUENCIES V01 TO V10.

frequencies v01 to v10.

fReQuEnCiEs v01 tO V10.

In contrast, Python is fully case sensitive. As a consequence, you always need to use the exact right casing in Python or you may otherwise trigger a wide variety of errors and warnings.

*PYTHON IS FULLY CASE SENSITIVE.

begin program python3.
myName = 'Ruben'
print(myName) # Ruben
print(myname) # [...] NameError: name 'myname' is not defined
end program.

Note that you also need to use the correct casing for SPSS variable names in Python. Avoiding all upper case for SPSS variable names tends to simplify things.
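
If you can't control the casing of variable names in your data, one workaround is to look them up case insensitively in Python. The helper find_variable below is a hypothetical sketch, not an SPSS function, and the list of names is typed in by hand rather than read from a dataset.

```python
def find_variable(name, variable_names):
    """Return the name as cased in variable_names, ignoring case when matching."""
    for candidate in variable_names:
        if candidate.lower() == name.lower():
            return candidate
    raise KeyError("No variable named %r" % name)

variable_names = ["V1", "EDUC", "gender", "SAlaRY"]

print("salary" in variable_names)               # False: casing differs
print(find_variable("salary", variable_names))  # SAlaRY
```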

Check Object Types in Python

In SPSS, “data types” mostly refer to variable types. Keep in mind that “variables” in SPSS are columns of cells that hold data values. There are only 2 SPSS variable types: string variables and numeric variables.

These 2 types determine what you can (not) do with a variable in SPSS. Confusingly, the Type column in variable view suggests that there are many more variable types, but these are formats, not types, as explained in SPSS Variable Types and Formats.

SPSS String Versus Numeric Variable In Variable View

In Python, we'll also encounter strings and numbers but these are called objects instead of variables and only hold a single value. An entire column of values is usually represented as a Python list object or a Python tuple. And these are different object types than the strings or integers they may contain.

Like so, there are many different Python object types which we briefly cover in Quick Overview Python Object Types. Just as with SPSS variable types, Python object types determine what you can (not) do with them and how. That's why you sometimes need to check what type of object you're dealing with. This is simply done with print(type(myObject)) as shown below.

*FIND WHAT TYPE OF OBJECT YOU'RE DEALING WITH.

begin program python3.
myName = 'Ruben'
print(type(myName)) # <class 'str'>
end program.

Note: Python users can create new object types known as Classes. Such classes typically have their own methods and properties. You may need a bit of study for dealing with some of them.
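
The difference between a container and the values it holds is easy to see in a short example. The object names below are made up for illustration; the code runs in any Python 3, with or without SPSS.

```python
label = 'Ruben'         # a single string
scores = [7, 8, 6]      # a list: could represent a column of values
pair = ('v01', 'v02')   # a tuple

for obj in (label, scores, pair, scores[0]):
    print(type(obj))
# <class 'str'>
# <class 'list'>
# <class 'tuple'>
# <class 'int'>

# isinstance() checks an object's type without printing it.
is_column = isinstance(scores, (list, tuple))
print(is_column)  # True
```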

Indentation Matters in Python

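Basic looping in SPSS is done with an opening command such as DO REPEAT or LOOP and a matching closing command such as END REPEAT or END LOOP.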

The syntax below briefly illustrates this structure.

*SPSS DO REPEAT EXAMPLE.

do repeat #vars = v01 to v05.
compute #vars = 0.
end repeat.

In Python, the start of a loop is indicated by a for or while statement. After that, the indentation of the lines that follow indicates where a loop ends. The examples below illustrate how this works.

*LOOP OVER 2 PRINT STATEMENTS.

begin program python3.
for ind in range(10):
    print('COMPUTE V{} = {}.'.format(ind,ind))
    print("VARIABLE LABELS V{} 'Mean Job Satisfaction Score'.".format(ind))
end program.

*LOOP OVER 1 PRINT STATEMENT.

begin program python3.
for ind in range(10):
    print('COMPUTE V{} = {}.'.format(ind,ind))
print("VARIABLE LABELS V{} 'Mean Job Satisfaction Score'.".format(ind))
end program.

This means that in order for your Python code to function properly, you must apply the correct indentation levels. This goes for

Note that we cover most of these in Conditions and Loops in Python.
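
Indentation levels stack when you nest such structures. The sketch below combines a function, a loop and a condition; the function name label_scores and the cutoff of 6 are arbitrary choices for illustration.

```python
def label_scores(scores):
    labels = []
    for score in scores:              # level 1: inside the function
        if score >= 6:                # level 2: inside the loop
            labels.append('pass')     # level 3: inside the condition
        else:
            labels.append('fail')
    return labels                     # back at level 1: runs after the loop ends

result = label_scores([4, 6, 8])
print(result)  # ['fail', 'pass', 'pass']
```

Indenting the return line one level further would place it inside the loop, so the function would stop after the first score.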

Python Assignment or Comparison Operator

In SPSS syntax, the = operator is used for both assignment and comparison.

The syntax below shows 2 minimal examples.

*USE = FOR ASSIGNMENT IN SPSS.

compute m01 = mean(v01 to v05).

*USE = FOR COMPARISON IN SPSS.

do if (gender = 0).
...
end if.

In contrast, Python uses = for assignment and == for comparison.

The examples below briefly illustrate the difference.

*DIFFERENCE = VERSUS ==.

begin program python3.
myAction = 'PRINT' # = assigns value to object
if myAction == 'PRINT': # == compares values
    print('FREQUENCIES ALL.')
end program.
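
SPSS users tend to type = where Python wants ==. Python refuses such a condition outright instead of silently assigning a value. The sketch below shows this by compiling two snippets without running them; the helper name is_valid_python is made up for this example.

```python
def is_valid_python(source):
    """Return True if the source code compiles, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(is_valid_python("if myAction == 'PRINT': pass"))  # True
print(is_valid_python("if myAction = 'PRINT': pass"))   # False
```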

For an overview of these and many other Python operators, read up on Python Operators - Quick Overview & Examples.

Escape Sequences in SPSS and Python

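In SPSS, we rarely use escape sequences but 2 main exceptions are a doubled single quote for a quote within a single-quoted string, and \n for a line break within a label.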

Few SPSS users are aware of such escape sequences but running the syntax below nicely illustrates both.

*SET UP MINIMAL TEST DATA.

data list free/v01.
begin data.
6
end data.

*ESCAPE SINGLE QUOTE BY DOUBLING IT.

add value labels v01 6 'Don''t know'.

*ESCAPE "n" BY BACKSLASH.

variable labels v01 'This label continues on\nline 2 and\nline 3'.

*SHOW LABELS IN SUCCEEDING OUTPUT TABLES.

set tnumbers both tvars both.

*QUICK CHECK.

frequencies v01.

Result

Escape Sequences In SPSS Output Table

So those are basic escape sequences in SPSS syntax. Now, what about Python?

In Python, escaping is always done with a backslash. A backslash itself is also escaped by a backslash. Alternatively, specify a string as a raw string by preceding it with “r”. The syntax below shows some basic examples.

*ESCAPE SINGLE QUOTE BY BACKSLASH.

begin program python3.
vallab = 'Don\'t know'
print(vallab)
end program.

*ESCAPE BACKSLASH WITH BACKSLASH.

begin program python3.
rDir = 'D:\\analyses_example\\new-data'
print(rDir)
end program.

*ESCAPE BACKSLASH BY RAW STRING.

begin program python3.
rDir = r'D:\analyses_example\new-data'
print(rDir)
end program.
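
Both escaping styles meet when Python builds SPSS syntax: the text must end up with doubled single quotes, as SPSS expects. The helper spss_quote below is a hypothetical sketch rather than a function from the spss module. The last lines compare escaped and raw strings.

```python
def spss_quote(text):
    """Wrap text in single quotes for SPSS syntax, doubling any single quotes inside."""
    return "'" + text.replace("'", "''") + "'"

command = "ADD VALUE LABELS v01 6 %s." % spss_quote("Don't know")
print(command)  # ADD VALUE LABELS v01 6 'Don''t know'.

# An escaped string and a raw string can hold exactly the same characters.
escaped = 'D:\\analyses_example\\new-data'
raw = r'D:\analyses_example\new-data'
print(escaped == raw)  # True

# Without "r", \n is a single newline character; with "r" it's 2 characters.
print(len('\n'), len(r'\n'))  # 1 2
```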

Comments in SPSS and Python

In SPSS we usually comment entire lines of syntax by starting them with an asterisk (*) and ending them with a period.

In rare cases, we may enter a comment within some command enclosed by /* and */ as shown below.

*THIS ENTIRE LINE IS A COMMENT IN SPSS SYNTAX.

frequencies all /* AND HERE'S A COMMENT WITHIN A COMMAND */
/barchart.

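In Python, everything between a hash sign (#) and the end of the line in which it occurs is a comment. Alternatively, enclose a (multiline) comment in triple quotes. Strictly speaking, this creates a string that Python evaluates and then ignores, but it's commonly used as a comment.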

*FROM # THROUGH END OF LINE IS PYTHON COMMENT.

begin program python3.
#FIRST GREET AUDIENCE BEFORE PROCEEDING
print('Hello!') #DONE
end program.

*PYTHON COMMENT BETWEEN TRIPLE SINGLE QUOTES.

begin program python3.
'''
This is a multiline
comment enclosed by
triple single quotes.
'''
print('Hello!')
end program.
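
One practical difference between the two styles: a triple-quoted string placed as the first statement of a function is kept by Python as its docstring, whereas # comments are discarded altogether. The function greet below is a made-up minimal example.

```python
def greet():
    """Return a greeting."""
    # This ordinary comment is discarded by Python.
    return 'Hello!'

print(greet())         # Hello!
print(greet.__doc__)   # Return a greeting.
```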

Thanks for reading!