By Ruben Geert van den Berg on April 18, 2017 under SPSS Python Basics Tutorials.

Python – The 5 Things You Want to Know

SPSS users who want to speed up their work by using Python will encounter some surprises. This tutorial walks you through the 5 major pitfalls and shows how to avoid them.

1. Python is Fully Case Sensitive

SPSS is mostly case insensitive; if we have a variable “gender”, we can address it in syntax as gender or GENDER or anything in between. On top of that, we can't have two variables gender and GENDER in SPSS because they'd be seen as the same variable.
In Python, none of the above holds. As it's fully case sensitive, we must always use the exact right casing for all objects. This is especially tricky when importing modules or using methods: they don't seem to exist if we don't use the correct casing.

*Wrong casing for module.

begin program.
import spss,spssclient # ImportError: No module named spssclient
end program.

*Correct casing.

begin program.
import spss,SpssClient
end program.

*Wrong casing for attribute.

begin program.
import spssaux
sDict = spssaux.Variabledict() # AttributeError: 'module' object has no attribute 'Variabledict'
end program.

*Correct casing.

begin program.
import spssaux
sDict = spssaux.VariableDict()
end program.

Result

SPSS Python Import Error Due to Wrong Casing
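
The same behaviour can be tried outside SPSS. The sketch below is a minimal illustration under a few assumptions: it is written for plain Python 3 (so print() is a function rather than the print statement used in the SPSS examples), it uses the standard string module as a stand-in for spssaux, and find_name is a hypothetical helper that is not part of any SPSS module.

```python
import string

# Names that differ only in casing are two separate objects in Python.
gender = "lower case"
GENDER = "upper case"
print(gender == GENDER)  # False


def find_name(module, name):
    """Return the attributes of module that match name when casing is ignored."""
    wanted = name.lower()
    return [attr for attr in dir(module) if attr.lower() == wanted]


# string.Formatter exists, string.formatter does not.
print(hasattr(string, "formatter"))    # False
print(find_name(string, "formatter"))  # ['Formatter']
```

A helper like this may come in handy when an AttributeError suggests that only the casing of a name is off.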

2. Indentation Matters in Python

In many computer languages (SPSS syntax, JavaScript, CSS, PHP, HTML and more), indentation is optional and mainly used for making code more readable. In Python, however, indentation defines code blocks: it indicates where loops and if clauses end. The very simple examples below illustrate how it works.

*Print "hello" and "bye" 5 times in loop.

begin program.
for i in range(5):
    print "hello"
    print "bye" #Indented so still in loop
end program.

*Print "hello" 5 times in loop and "bye" just once.

begin program.
for i in range(5):
    print "hello"
print "bye" #Not indented so loop has ended
end program.

Python Indentation in SPSS
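
To make the effect of indentation easy to verify, the sketch below counts how often each line runs instead of printing text. It assumes plain Python 3 outside SPSS, and the variable names are made up for this illustration.

```python
hello_count = 0
bye_count = 0

for i in range(5):
    hello_count += 1  # indented: runs on every pass through the loop
bye_count += 1        # not indented: runs once, after the loop has ended

print(hello_count, bye_count)  # 5 1

# Nested blocks simply add another level of indentation.
even_numbers = []
for i in range(5):
    if i % 2 == 0:
        even_numbers.append(i)  # belongs to the if clause inside the loop

print(even_numbers)  # [0, 2, 4]
```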

We could indent lines with 1 or 2 spaces, but 4 spaces is most common and is what PEP 8, Python's style guide, recommends. We can set this in SPSS by navigating to Edit > Options > Syntax editor > Indent size and setting it to 5. (SPSS seems to insert the number of spaces minus 1, which we think may be a minor bug.)
Pressing tab in the Syntax Editor window now results in 4 spaces.

SPSS Tab Setting 4 Spaces

Note that Notepad++ (recommended for writing larger blocks of Python code) has a similar setting and some very handy keyboard shortcuts for indenting or outdenting entire sections.

3. Comment Your Code

For SPSS syntax as well as Python, adding comments to your code is a great idea. In Python, everything between a hash sign (#) and the end of a line is treated as a comment. By default, Notepad++ shows Python comments in green as shown below.

Python Comments in SPSS
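
A minimal sketch of the two usual comment styles, assuming plain Python 3 outside SPSS; the variable names and values are invented for illustration. It also shows one exception worth knowing: a hash sign inside a quoted string does not start a comment.

```python
# A full-line comment: Python ignores this line entirely.
respondents = 25  # an inline comment after a statement

# A hash sign inside quotes is part of the string, not a comment.
label = "Respondent #1"

print(respondents)  # 25
print(label)        # Respondent #1
```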

4. Print Objects and their Types

If you create Python objects such as strings or lists yourself, you'll probably know their types and what they contain. This allows you to work with them in a goal-directed manner.
However, if we retrieve objects from SPSS such as value labels or data values, we're not always sure how they end up in Python. The solution is to run print object and print type(object). The example below (using employees.sav) illustrates how this helps us out.

*Look up value labels for job satisfaction.

begin program.
import spssaux
sDict = spssaux.VariableDict()
vallabs = sDict['job_satisfaction'].ValueLabels
print type(vallabs) # <type 'dict'>
end program.

*Since vallabs = Python dict object, we can retrieve key-value pairs with iteritems() method.

begin program.
for key,val in vallabs.iteritems():
    print key,val
end program.
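
For readers without employees.sav at hand, the idea can be tried with a stand-in object. The sketch below assumes plain Python 3, where iteritems() no longer exists and items() takes its place. The dictionary contents are invented; the real value labels may use different keys and labels, which is exactly what printing the object and its type reveals.

```python
# Hypothetical stand-in for the value labels retrieved from SPSS.
vallabs = {1: "Very dissatisfied", 2: "Dissatisfied", 3: "Neutral"}

print(type(vallabs))              # <class 'dict'>
print(isinstance(vallabs, dict))  # True

# Knowing it's a dict, we can loop over its key-value pairs in sorted order.
lines = []
for key, val in sorted(vallabs.items()):
    lines.append("%s = %s" % (key, val))

print("\n".join(lines))
```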

5. Be Careful with Backslashes

Python uses the backslash (\) as an escape character in strings. This may yield unexpected results if we're not aware of it. For example, we can't specify the path to an SPSS file as somepath = 'c:\newdata\data.sav' because Python reads “\n” as a line break rather than as a backslash followed by the letter n. One solution is to prefix the entire string with “r”, which is short for raw string and makes Python take backslashes literally. Therefore, somepath = r'c:\newdata\data.sav' will work as we intended.

Raw Strings in Python

The examples below show some wrong and correct uses of backslashes in Python.

*Wrong way: \n indicates new line.

begin program.
somepath = 'c:\newdata\data.sav'
print somepath
end program.

*Right way: \n in raw string is just \n.

begin program.
somepath = r'c:\newdata\data.sav'
print somepath
end program.

*Wrong way: second quote ends string prematurely.

begin program.
print 'I don't know!'
end program.

*Right way: \ escapes second quote.

begin program.
print 'I don\'t know!'
end program.
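
The difference between these strings becomes visible when we compare their lengths. The sketch below assumes plain Python 3 outside SPSS and uses a shortened, made-up path. Besides raw strings, it shows two alternatives not covered above: doubling the backslash, and wrapping text that contains an apostrophe in double quotes.

```python
wrong = 'c:\newdata'      # \n is read as a single line break character
raw = r'c:\newdata'       # raw string: the backslash is kept as typed
escaped = 'c:\\newdata'   # a doubled backslash also yields one literal backslash

print(len(wrong), len(raw))  # 9 10
print('\n' in wrong)         # True
print(raw == escaped)        # True

# Wrapping the text in double quotes avoids escaping the apostrophe.
print("I don't know!" == 'I don\'t know!')  # True
```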

Right, I guess these are the major pitfalls in Python. If you agree or disagree, or if I forgot to mention something, please let me know by dropping a comment below.

Thanks for reading!
