SPSS Tutorials

SPSS Python String Tutorial

The most important Python objects we'll deal with are strings. This tutorial presents a quick overview of Python string methods. However, let's first cover some basics.

Python Strings - Basic Rules

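There are more rules, but the 3 demonstrated by the examples below will do for practical purposes: a string is enclosed in quotes, a string that spans multiple lines is enclosed in triple quotes, and a quote inside a string is escaped with a backslash.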

Create Empty Python String

*Create empty string object - single line.

begin program.
myString = '' #Create empty string
print myString #(Empty line)
print type(myString) #<type 'str'>
end program.

*Create multiple line string.

begin program.
myString = '''
T-TEST GROUPS=gender(0 1)
/MISSING=ANALYSIS
/VARIABLES=amount_spent
/CRITERIA=CI(.95).
'''
print myString
end program.

*Escape single quote in string.

begin program.
myString = 'I don\'t know!'
print myString #I don't know!
end program.

Note: creating an empty string sounds pretty useless. However, we can concatenate lines of SPSS syntax or variable names to our empty string and this is often a nice way to build our syntax.
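A minimal sketch of this idea follows. It is plain Python that would go between begin program and end program. The variable names v1 to v3 are hypothetical, and print is written with parentheses so the snippet runs in both Python 2 and Python 3.

```python
# Hypothetical variable names, for illustration only.
variables = ['v1', 'v2', 'v3']

syntax = ''  # Start with an empty string
for name in variables:
    # Append one line of SPSS syntax per variable
    syntax += 'FREQUENCIES ' + name + '.\n'

print(syntax)
```

The resulting string holds three FREQUENCIES commands, one per line.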

Overview Python String Methods

| What? | How? | Example | Returns |
|---|---|---|---|
| Extract Substring | [] | myString[0] | String |
| Concatenate 2(+) Strings | + or += | myString + myString | String |
| Find Leftmost Occurrence of Substring | find | myString.find('a') | Integer |
| Find Rightmost Occurrence of Substring | rfind | myString.rfind('a') | Integer |
| Replace 1(+) Characters | replace | myString.replace('a','b') | String |
| Find Length of String | len | len(myString) | Integer |
| Lowercase String | lower | myString.lower() | String |
| Uppercase String | upper | myString.upper() | String |
| Capitalize String | capitalize | myString.capitalize() | String |
| Remove Characters from Left Part of String | lstrip | myString.lstrip() | String |
| Remove Characters from Right Part of String | rstrip | myString.rstrip() | String |
| Remove Characters from Left and Right Part of String | strip | myString.strip() | String |
| Convert String to Integer | int | int(myString) | Integer |
| Split String into Python List | split | myString.split(' ') | List |
| Check if String Starts With... | startswith | myString.startswith("var") | Boolean |
| Check if String Ends With... | endswith | myString.endswith("var") | Boolean |
| Left Pad String with Zeroes | zfill | myString.zfill(3) | String |

Extract Substring in Python

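We extract substrings in Python with square brackets that may contain one or two indices and a colon. Indices start at 0 and the character at the end index is not included, so myString[1:3] returns the second and third characters. Negative indices count from the end of the string.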

Python Substring Examples

*SPSS Python substring examples.

begin program.
myString = 'abcdefghij'
print myString[0] #a
print myString[1:] #bcdefghij
print myString[:4] #abcd
print myString[1:3] #bc
print myString[-1] #j
end program.
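A few more slices on the same example string, as a small sketch. It is plain Python for use between begin program and end program, and print is written with parentheses so it runs in both Python 2 and Python 3.

```python
myString = 'abcdefghij'

lastThree = myString[-3:]    # Negative start index: last 3 characters
allButLast = myString[:-1]   # Negative end index: drop the last character
tooFar = myString[5:100]     # An end index past the string's end is allowed

print(lastThree)   # hij
print(allButLast)  # abcdefghi
print(tooFar)      # fghij
```

Note that a slice never raises an error for an out-of-range index, whereas a single index such as myString[100] does.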

Concatenate 2(+) Strings

Basically, + concatenates two or more strings. Additionally, myString += 'a' is a nice shorthand for myString = myString + 'a' that we'll often use for building SPSS syntax.

Python Concatenate Examples

*1. Concatenate with "+".

begin program.
myString = 'abc'
print myString + 'def' #abcdef
end program.

*2. Concatenate with "+=".

begin program.
myString = 'abc'
for i in range(5):
    myString += str(i)
print myString #abc01234
end program.

Find Leftmost Occurrence of Substring

Retrieving positions for single or multiple character substrings is done in Python with find. Keep in mind here that find is case sensitive and that Python starts counting positions at 0, not 1. So in our example sentence, 'C' has index 0, 'y' has index 1 and 'c' has index 2.

[Figure: Python character indices]

Python Find Examples

*Find leftmost occurrence of substring.

begin program.
myString = 'Cycling in the mountains is fun.'
print myString.find('c') # 2
print myString.find('in') # 4
end program.
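Two details of find are worth a quick sketch: it returns -1 when the substring doesn't occur, and lowercasing the string first is one way to search regardless of case. This is plain Python for use between begin program and end program, and print is written with parentheses so it runs in both Python 2 and Python 3.

```python
myString = 'Cycling in the mountains is fun.'

posUpper = myString.find('C')          # Uppercase 'C' is the first character
posMissing = myString.find('x')        # No 'x' in the sentence
posAnyCase = myString.lower().find('c')  # Search the lowercased copy instead

print(posUpper)    # 0
print(posMissing)  # -1
print(posAnyCase)  # 0
```

Testing for -1 is thus a simple way to check whether a substring is absent.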

Find Rightmost Occurrence of Substring

*Find rightmost occurrence of substring.

begin program.
myString = 'Cycling in the mountains is fun.'
print myString.rfind('i') # 25
print myString.rfind('in') # 21
end program.

Replace 1(+) Characters

*Replace one or more characters in string.

begin program.
myString = 'The cat caught the mouse in the living room.'
print myString.replace('a','') #The ct cught the mouse in the living room.
print myString.replace('the','a') # The cat caught a mouse in a living room.
end program.
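Since replace is case sensitive, 'The' at the start of the sentence is left alone in the example above. The sketch below shows two variations: limiting the number of replacements with an optional third argument, and chaining two replace calls. It is plain Python for use between begin program and end program, with print in parentheses so it runs in both Python 2 and Python 3.

```python
myString = 'The cat caught the mouse in the living room.'

# Third argument: replace at most 1 occurrence, counted from the left
firstOnly = myString.replace('the', 'a', 1)

# Chain two calls to cover both 'The' and 'the'
chained = myString.replace('The', 'A').replace('the', 'a')

print(firstOnly)  # The cat caught a mouse in the living room.
print(chained)    # A cat caught a mouse in a living room.
```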

Note: the first replace call replaces all a’s with an empty string. That is, it removes all a’s from our example sentence.

Find Length of String

*Find length of string.

begin program.
myString = 'abcde'
print len(myString) # 5
end program.

Lowercase String

*Lowercase string.

begin program.
myString = 'SPSS Is Fun!'
print myString.lower() # spss is fun!
end program.

Uppercase String

*Uppercase string.

begin program.
myString = 'This is Some Title'
print myString.upper() # THIS IS SOME TITLE
end program.

Capitalize String

In Python, “capitalizing” means returning a string with its first character in upper case and all other characters in lower case, even if they were upper case in the original string.

*Capitalize string.

begin program.
myString = 'aBcDeF'
print myString.capitalize() # Abcdef
end program.

Remove Characters from Left Part of String

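In Python, lstrip() without arguments removes all whitespace (spaces, tabs and line breaks) from the beginning of a string. Other leading characters can be removed by specifying them within the parentheses (example B below). The argument is treated as a set of characters: lstrip keeps removing leading characters for as long as they occur in that set.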

*A. Remove whitespace from start of string.

begin program.
myString = '    left padding removed'
print myString.lstrip() # left padding removed
end program.

*B. Remove asterisks (*) from start of string.

begin program.
myString = '****left padding removed'
print myString.lstrip('*') # left padding removed
end program.
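Because the argument of lstrip is a set of characters rather than a prefix, it can remove more than intended. The sketch below illustrates this with a hypothetical variable name. It is plain Python for use between begin program and end program, with print in parentheses so it runs in both Python 2 and Python 3.

```python
# Several different leading characters are removed in one go
mixed = '*-*-left padding removed'.lstrip('*-')

# Pitfall: 'v_' is not treated as a prefix but as the characters 'v' and '_'
varName = 'v_var1'  # Hypothetical variable name
stripped = varName.lstrip('v_')

# Removing an actual prefix is safer with startswith and a slice
prefix = 'v_'
withoutPrefix = varName[len(prefix):] if varName.startswith(prefix) else varName

print(mixed)          # left padding removed
print(stripped)       # ar1
print(withoutPrefix)  # var1
```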

Remove Characters from Right Part of String

*A. Remove whitespace from end of string.

begin program.
myString = 'right padding removed    '
print myString.rstrip() # right padding removed
end program.

*B. Remove asterisks (*) from end of string.

begin program.
myString = 'right padding removed****'
print myString.rstrip('*') # right padding removed
end program.

Remove Characters from Left and Right Part of String

*A. Remove whitespace from start and end of string.

begin program.
myString = '    left and right padding removed    '
print myString.strip() # left and right padding removed
end program.

*B. Remove asterisks (*) from start and end of string.

begin program.
myString = '****left and right padding removed****'
print myString.strip('*') # left and right padding removed
end program.

Convert String to Integer

In Python, int converts a string to an integer. If the string doesn't represent a whole number (for instance 'abc' or '1.5'), int raises a ValueError.

*Convert String to Integer.

begin program.
myString = '123'
myInt = int(myString)
print type(myInt) # <type 'int'>
print myInt # 123
end program.
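One way to deal with strings that may not hold a whole number is to catch the ValueError, as in the sketch below. The helper name to_int is hypothetical and returning None is just one possible choice. It is plain Python for use between begin program and end program, with print in parentheses so it runs in both Python 2 and Python 3.

```python
def to_int(text):
    """Return text as an integer, or None if it can't be converted."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int('123'))   # 123
print(to_int(' 42 '))  # 42 (surrounding whitespace is accepted)
print(to_int('1.5'))   # None
print(to_int('abc'))   # None
```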

Split String into Python List

The example below splits a string into a Python list object. split can't split on an empty separator: myString.split('') raises an error. Splitting a string into its separate characters can be done with a list comprehension instead (example B below).

*A. Split string into Python list object.

begin program.
myString = 'A A C A B C'
myList = myString.split(' ')
print type(myList) # <type 'list'>
print myList # ['A', 'A', 'C', 'A', 'B', 'C']
end program.

*B. Split string into Python list without separator.

begin program.
myString = 'AACABC'
myList = [i for i in myString]
print myList # ['A', 'A', 'C', 'A', 'B', 'C']
end program.
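Two related variations, as a small sketch: calling split without any argument splits on runs of whitespace, and list() is a shorter alternative to the list comprehension. This is plain Python for use between begin program and end program, with print in parentheses so it runs in both Python 2 and Python 3.

```python
spaced = 'A  A C\tB'  # Double space and a tab between values

bySpace = spaced.split(' ')   # Each single space separates, so '' shows up
byWhitespace = spaced.split() # Any run of spaces or tabs separates
characters = list('AACABC')   # One list element per character

print(bySpace)       # ['A', '', 'A', 'C\tB']
print(byWhitespace)  # ['A', 'A', 'C', 'B']
print(characters)    # ['A', 'A', 'C', 'A', 'B', 'C']
```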

Check if String Starts With...

*Evaluate if string starts with given substring.

begin program.
myString = 'abcdef'
print myString.startswith('abc') # True
print myString.startswith('bcd') # False
end program.

*Typical use of startswith().

begin program.
if myString.startswith('a'):
    print "First character is 'a'."
else:
    print "First character is not 'a'."
end program.

Note: True and False are known as Boolean values. We mostly use them in Python if statements, when we want to run some code only if a condition holds.

Check if String Ends With...

*Evaluate if string ends with given substring.

begin program.
myString = 'abcdef'
print myString.endswith('f') # True
print myString.endswith('e') # False
end program.
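A typical use of startswith and endswith is selecting names from a list, as sketched below. The variable names are made up for illustration. This is plain Python for use between begin program and end program, with print in parentheses so it runs in both Python 2 and Python 3.

```python
# Hypothetical variable names, for illustration only.
names = ['var01', 'var02', 'age', 'income_raw', 'weight_raw']

startWithVar = [name for name in names if name.startswith('var')]
endWithRaw = [name for name in names if name.endswith('_raw')]

print(startWithVar)  # ['var01', 'var02']
print(endWithRaw)    # ['income_raw', 'weight_raw']
```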

Left Pad String with Zeroes

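In Python, zfill(3) left pads a string with zeroes up to a total length of 3 characters. Strings that are already 3 or more characters long are returned unchanged. We mostly do so when we want to sort numbers alphabetically: 002 comes before 010, whereas an unpadded 2 comes after 10.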

*Left pad string with zeroes.

begin program.
myString = '1'
print myString.zfill(3) # 001
myString = '10'
print myString.zfill(3) # 010
end program.
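The effect on sorting can be sketched as follows. This is plain Python for use between begin program and end program, with print in parentheses so it runs in both Python 2 and Python 3.

```python
numbers = ['10', '2', '1']

# Strings sort character by character, so '10' comes before '2'
unpaddedSorted = sorted(numbers)

# After padding, alphabetical order matches numerical order
paddedSorted = sorted(number.zfill(3) for number in numbers)

# A string longer than the requested width is left as it is
tooLong = '1234'.zfill(3)

print(unpaddedSorted)  # ['1', '10', '2']
print(paddedSorted)    # ['001', '002', '010']
print(tooLong)         # 1234
```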

Previous tutorial: Python – The 5 Things You Want to Know

Next tutorial: SPSS Python Text Replacement Tutorial
