Serious SPSS users have probably heard that using Python in SPSS can dramatically speed up your daily SPSS tasks.
So what is it and how does it work? This brief tutorial quickly walks you through.
- How Does Python Relate to SPSS?
- What Is Python Known For?
- Why Should I Use Python in SPSS?
- Where Can I Get Python for SPSS?
- Which are Some SPSS Python Examples?
Introduction
Python is one of the most widely used general-purpose programming languages today. Its development started in late 1989 and it was first released in 1991.
SPSS, short for "statistical package for the social sciences", is user-friendly software for data editing, data analysis and statistical procedures. SPSS is much older than Python: it has been in use since 1968. Originally, Python had nothing whatsoever to do with SPSS: they were two completely unrelated software packages until roughly 2005.
So How Does Python Relate to SPSS?
Around 2005, the SPSS developers created software that connects SPSS with Python: the SPSS-Python plugin. This plugin made it possible to
- send Python code to Python from SPSS;
- have Python look up anything in SPSS: variable names or labels, data values, output tables and more;
- use such information to create (large amounts of) custom-made SPSS syntax;
- have Python send such syntax back to SPSS and execute it;
- or, alternatively, have Python directly modify SPSS output tables & charts, syntax or data.
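As a rough sketch of the "look up, then build syntax" steps listed above, the plain Python below creates a single FREQUENCIES command for all variables whose names start with a given prefix. The function name build_frequencies_syntax and the variable names are made up for illustration. Inside SPSS, the names would typically be looked up with the spss module (for example spss.GetVariableCount and spss.GetVariableName) and the result would be run with spss.Submit. Those calls are only mentioned in comments here, so the example also runs outside SPSS.

```python
def build_frequencies_syntax(variable_names, prefix):
    """Return one FREQUENCIES command for all variables starting with prefix."""
    selected = [name for name in variable_names
                if name.lower().startswith(prefix.lower())]
    if not selected:
        return ""  # nothing to run
    return "FREQUENCIES VARIABLES=" + " ".join(selected) + "."


# Hypothetical variable names. Inside SPSS these could be looked up with
# [spss.GetVariableName(i) for i in range(spss.GetVariableCount())].
names = ["id", "q1", "q2", "q3", "age"]

syntax = build_frequencies_syntax(names, "q")
print(syntax)  # prints: FREQUENCIES VARIABLES=q1 q2 q3.

# Inside SPSS, spss.Submit(syntax) would then have SPSS execute the command.
```

The point of the design is that the syntax is just a text string: Python builds it from whatever it found in the data, and SPSS runs it like any other syntax.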
The figure below sketches some of these interactions between SPSS and Python.
What Is Python Known For?
- Python is among the most important programming languages in use today. It is open source software and it may be freely downloaded, used and distributed, even for commercial use.
- Python was deliberately designed to be intuitive and easy to learn: a programming language suitable for non-programmers, which probably includes most SPSS users.
- Python easily handles a huge variety of tasks: it can read, write or modify text files, Excel files, MySQL databases, SPSS and much more.
- Python does a couple of things very differently from the bulk of programming languages. Its use of indentation for control structures (looping and conditional processing) is especially original but effective.
- Just like www.spss-tutorials.com, Python was invented and created in Amsterdam, the Netherlands.
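To make the point about indentation concrete, here is a minimal sketch with made-up data (the function name classify and the 5.5 cutoff are purely illustrative). There are no braces or END keywords: the indented lines are the body of the loop and of the condition.

```python
def classify(scores, cutoff=5.5):
    """Label each score as 'pass' or 'fail'."""
    labels = []
    for score in scores:              # everything indented below belongs to the loop
        if score >= cutoff:           # indented once more: the body of the condition
            labels.append("pass")
        else:
            labels.append("fail")
    return labels                     # back at function level: runs after the loop


result = classify([4.0, 5.5, 8.2])
print(result)  # prints: ['fail', 'pass', 'pass']
```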
Why Should I Use Python in SPSS?
- For some larger SPSS tasks, using Python for SPSS may drastically reduce the amount of time and effort they require.
- Also, Python may drastically decrease the amount of SPSS syntax required by some tasks. Shorter syntax is much easier to read, adjust and correct.
- Some SPSS tasks are not possible at all with basic syntax but are easily accomplished by Python.
- There are no costs associated with using Python in SPSS.
- There's an ever growing number of SPSS extensions freely available that can be used from SPSS’ menu. Most of these require Python to actually run.
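The claim about shorter syntax can be illustrated with a small sketch: a few lines of Python that write one scatterplot command for every pair of variables. The function name scatterplot_syntax and the variable names v1 to v5 are assumptions for illustration only. Inside SPSS, the resulting text could be passed to spss.Submit to actually run it.

```python
from itertools import combinations


def scatterplot_syntax(variable_names):
    """Return one GRAPH command per unique pair of variables."""
    commands = []
    for x, y in combinations(variable_names, 2):
        commands.append("GRAPH /SCATTERPLOT(BIVAR)=" + x + " WITH " + y + ".")
    return "\n".join(commands)


# Hypothetical variable names.
names = ["v1", "v2", "v3", "v4", "v5"]

syntax = scatterplot_syntax(names)
print(syntax)
print(len(syntax.splitlines()), "commands")  # 5 variables give 10 pairs
```

With 20 variables the same loop would write 190 commands, which is where typing them out by hand stops being realistic.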
Where Can I Get Python for SPSS?
Finally an easy question... Recent SPSS versions are integrated with Python by default. It is located in the Python3 folder in your SPSS installation folder as shown below.
For more details on this, read up on Python for SPSS - How to Use It?
Which are Some SPSS Python Examples?
Some excellent SPSS-Python code is found in many of our SPSS tools:
- SPSS Mean Centering and Interaction Tool is super useful for moderation regression;
- SPSS - Recode with Value Labels Tool recodes values and moves their value labels from the old onto the new values;
- SPSS - Create All Scatterplots Tool runs scatterplots with(out) regression lines and tables for many pairs of variables in one go;
- SPSS - Clean Labels Tool performs text replacements over many value and/or variable labels;
- SPSS Create Dummy Variables Tool creates dummy variables for regression.
Note: when using these tools, you don't immediately see the underlying Python code. However, if you unzip the SPSS extension (.spe) files, you'll find that each of them includes a Python (.py) file with the SPSS-Python code being used.
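If you'd rather not unzip by hand, something like the sketch below should list the Python files inside an extension. It assumes an .spe file can be opened as an ordinary zip archive, which is what the note above implies. The function name list_python_files and the demo file names are made up, and the demo builds its own tiny stand-in archive so that it runs without a real extension.

```python
import os
import tempfile
import zipfile


def list_python_files(spe_path):
    """Return the names of all .py files inside an .spe (zip) archive."""
    with zipfile.ZipFile(spe_path) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(".py")]


# Demo with a stand-in archive; replace demo_path with the path to a real .spe file.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo_tool.spe")  # hypothetical file name
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("demo_tool.py", "print('hello from the tool')\n")
        archive.writestr("demo_tool.xml", "<extension/>\n")
    found = list_python_files(demo_path)

print(found)  # prints: ['demo_tool.py']
```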
Thanks for reading!
THIS TUTORIAL HAS 20 COMMENTS:
By Daniel Omoko on December 18th, 2015
This is quite insightful. It's my first time to interact with such and I feel I have an idea of what to do. Thank you
By AZHARI on February 16th, 2016
Thank
By Venkatesh gandi on July 29th, 2016
"Some things that people typically want to do in SPSS are not possible at all with native syntax but are no problem for Python." May I know which things aren't possible with native syntax? Please share those situations and examples. :|
By Ruben Geert van den Berg on July 30th, 2016
Hi Gandi!
For one thing, you can't edit SPSS output with native SPSS syntax. Often requested features such as hiding rows and columns from pivot tables, setting decimal places for numeric output or exact dimensions for charts can't be done with native SPSS syntax.
Then there are additional Python functions and modules for regular expressions, managing MySQL databases, Excel files and other things that are completely absent from SPSS.
Third, there are things that could theoretically be done with native syntax or macros but that require hopelessly inefficient hacks and lots of manual editing. Think of looking up dictionary information such as variable names, value labels and variable types.
Search our website for "Python" and you'll find tons of examples. Some of those can't be done at all with native syntax and others can't reasonably be done with just native syntax.
By Mukti Subedi on September 7th, 2016
Very easy to follow along. Thanks, Great source of information