
# Overview Python String Methods

String manipulation is an important part of any programming language. In Python, most of it is done with string methods. The table below gives a quick overview.

## Overview Python String Methods

| WHAT | PYTHON | RETURNS | PYTHON EXAMPLE | SPSS EXAMPLE |
|---|---|---|---|---|
| Extract Substring | `[]` | String | `myString[0]` | `compute str01 = char.substr(str01,1,1).` |
| Concatenate 2(+) Strings | `+` or `+=` | String | `myString + myString` | `compute str01 = concat(str01,str02).` |
| Find Leftmost Occurrence of Substring | `find` | Integer | `myString.find('a')` | `compute pos = char.index(str01,'a').` |
| Find Rightmost Occurrence of Substring | `rfind` | Integer | `myString.rfind('a')` | `compute pos = char.rindex(str01,'a').` |
| Replace 1(+) Characters | `replace` | String | `myString.replace('a','b')` | `compute str01 = replace(str01,'a','b').` |
| Find Length of String | `len` | Integer | `len(myString)` | `compute len01 = char.length(str01).` |
| Lowercase String | `lower` | String | `myString.lower()` | `compute str01 = lower(str01).` |
| Uppercase String | `upper` | String | `myString.upper()` | `compute str01 = upper(str01).` |
| Capitalize String | `capitalize` | String | `myString.capitalize()` | (None) |
| Remove Characters from Left Part of String | `lstrip()` | String | `myString.lstrip()` | `compute str01 = ltrim(str01).` |
| Remove Characters from Right Part of String | `rstrip()` | String | `myString.rstrip()` | `compute str01 = rtrim(str01).` |
| Remove Characters from Left and Right Part of String | `strip()` | String | `myString.strip()` | (None) |
| Convert String to Integer | `int` | Integer | `int(myString)` | `compute num01 = number(str01,comma16).` (Or use ALTER TYPE.) |
| Split String into Python List | `split` | List | `myString.split(' ')` | (None) |
| Check if String Starts With... | `startswith` | Boolean | `myString.startswith("var")` | (None) |
| Check if String Ends With... | `endswith` | Boolean | `myString.endswith("var")` | (None) |

## Extract Substring in Python

We extract substrings in Python with square brackets that may contain one or two indices and a colon. Like so,

• `myString[0]` extracts the first character;
• `myString[1:]` extracts the second through last characters;
• `myString[:4]` extracts the first through fourth characters;
• `myString[1:3]` extracts the second through third characters;
• `myString[-1]` extracts the last character.

## Python Substring Examples

*SPSS PYTHON SUBSTRING EXAMPLES.

begin program python3.
myString = 'abcdefghij'
print(myString[0]) # a
print(myString[1:]) # bcdefghij
print(myString[:4]) # abcd
print(myString[1:3]) # bc
print(myString[-1]) # j
end program.
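Two more slicing behaviours are worth knowing. The sketch below is an illustration on the same example string, not part of the original syntax: negative numbers also work inside slices, and a slice that runs past the end of a string is simply cut off there, whereas a single out-of-range index raises an error.

```python
myString = 'abcdefghij'

lastThree = myString[-3:]    # negative start: the last three characters
allButLast = myString[:-1]   # negative stop: everything except the last character
pastEnd = myString[5:100]    # slices are clipped to the length of the string

print(lastThree)   # hij
print(allButLast)  # abcdefghi
print(pastEnd)     # fghij

# A single index outside the string is an error, unlike a slice.
try:
    myString[100]
    indexFailed = False
except IndexError:
    indexFailed = True
print(indexFailed)  # True
```

Like the other examples, this code can be run in SPSS by placing it between `begin program python3.` and `end program.`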

## Concatenating Strings in Python

Basically, `+` concatenates two or more strings. Additionally, `myString += 'a'` is a nice shorthand for `myString = myString + 'a'` that we'll often use for building SPSS syntax.

## Python Concatenate Examples

*1. CONCATENATE WITH "+".

begin program python3.
myString = 'abc'
print(myString + 'def') #abcdef
end program.

*2. CONCATENATE WITH "+=".

begin program python3.
myString = 'abc'
for i in range(5):
    myString += str(i)
print(myString) #abc01234
end program.
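As a rough sketch of how `+=` helps with building SPSS syntax, the example below adds one FREQUENCIES command per variable to a string. The variable names `v1` through `v3` are made up for illustration, and the example only prints the syntax rather than running it.

```python
# Hypothetical variable names, assumed for this illustration only.
varNames = ['v1', 'v2', 'v3']

spssSyntax = ''
for varName in varNames:
    # Each pass appends one complete command plus a line break.
    spssSyntax += 'frequencies ' + varName + '.\n'

print(spssSyntax)
# frequencies v1.
# frequencies v2.
# frequencies v3.
```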

Note: in these examples, we're technically creating new string objects rather than truly changing existing string objects. This is because strings are immutable in Python.
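The immutability mentioned in the note can be checked directly. The following is a minimal sketch, not part of the original syntax: assigning to a single character fails, and `+=` makes the name point to a new string while any other name still refers to the old one.

```python
myString = 'abc'

# Item assignment is not allowed because strings are immutable.
try:
    myString[0] = 'X'
    assignmentWorked = True
except TypeError:
    assignmentWorked = False
print(assignmentWorked)  # False

# "+=" binds myString to a new string; "original" keeps the old value.
original = myString
myString += 'def'
print(myString)  # abcdef
print(original)  # abc
```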

## Find Leftmost Occurrence of Substring

Retrieving positions for single or multiple character substrings is done in Python with `find`. Keep in mind here that

• Python is fully case sensitive and
• Python objects are zero-indexed.

Like so, the first character of a string has index 0, the second has index 1 and so on.

## Python Find Examples

*FIND LEFTMOST OCCURRENCE OF SUBSTRING.

begin program python3.
myString = 'Cycling in the mountains is fun.'
print(myString.find('c')) # 2
print(myString.find('in')) # 4
end program.
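Two details of `find` are easy to trip over: it returns -1 when the substring doesn't occur, and it's case sensitive. The sketch below illustrates both on the same sentence; lowercasing the string first is just one possible way to search regardless of case.

```python
myString = 'Cycling in the mountains is fun.'

notFound = myString.find('x')         # 'x' does not occur anywhere
upperC = myString.find('C')           # uppercase 'C' is the very first character
anyCase = myString.lower().find('c')  # lowercase the string first, then search

print(notFound)  # -1
print(upperC)    # 0
print(anyCase)   # 0

# Compare against -1 instead of relying on truthiness: 0 is a valid position.
found = myString.find('Cyc') != -1
print(found)     # True
```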

## Find Rightmost Occurrence of Substring

In Python, `rfind` returns the index (again, starting from zero) for the rightmost occurrence of some substring in a string. The syntax below shows a couple of examples.

*FIND RIGHTMOST OCCURRENCE OF SUBSTRING.

begin program python3.
myString = 'Cycling in the mountains is fun.'
print(myString.rfind('i')) # 25
print(myString.rfind('in')) # 21
end program.
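A typical use of `rfind` is cutting a string at its last separator. The sketch below combines it with slicing to split a file path into a folder and a file name; the path itself is hypothetical and only serves as an illustration.

```python
# Hypothetical file path, assumed for this illustration only.
filePath = 'C:/data/survey/wave1.sav'

lastSlash = filePath.rfind('/')      # position of the final '/'
folder = filePath[:lastSlash]        # everything before it
fileName = filePath[lastSlash + 1:]  # everything after it

print(folder)    # C:/data/survey
print(fileName)  # wave1.sav
```

If the separator doesn't occur at all, `rfind` returns -1, so a real script would probably want to check for that before slicing.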

## Replacing Characters in a Python String

Replacing characters in a string is done with `replace` in Python as shown below.

*Replace one or more characters in string.

begin program python3.
myString = 'The cat caught the mouse in the living room.'
print(myString.replace('a','')) #The ct cught the mouse in the living room.
print(myString.replace('the','a')) # The cat caught a mouse in a living room.
end program.

Note: in line 5 we replace all a’s with an empty string. That is, we'll remove all a’s from our example sentence.
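By default, `replace` changes every occurrence and is case sensitive, which is why the leading "The" stays untouched in the example above. As a small sketch beyond the original syntax, an optional third argument limits the number of replacements, and calls can be chained to cover both spellings.

```python
myString = 'The cat caught the mouse in the living room.'

# Third argument: replace only the first occurrence of 'the'.
firstOnly = myString.replace('the', 'a', 1)
print(firstOnly)  # The cat caught a mouse in the living room.

# Chaining two calls handles 'The' and 'the' separately.
bothCases = myString.replace('The', 'A').replace('the', 'a')
print(bothCases)  # A cat caught a mouse in a living room.

# If the substring doesn't occur, the string comes back unchanged.
unchanged = myString.replace('dog', 'cat')
print(unchanged == myString)  # True
```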

## Find Length of Python String

In Python, `len` returns the number of characters (not bytes) of some string object.

*FIND LENGTH OF STRING.

begin program python3.
myString = 'abcde'
print(len(myString)) # 5
end program.
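The difference between characters and bytes only shows up for non-ASCII text. The sketch below, an illustration that isn't part of the original syntax, compares `len` of a string with the length of its UTF-8 encoding.

```python
myString = 'caf\u00e9'                  # 'café', with é written as an escape
nChars = len(myString)                  # counts characters
nBytes = len(myString.encode('utf-8'))  # é takes two bytes in UTF-8

print(nChars)  # 4
print(nBytes)  # 5
```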

## Convert Python String to Lowercase

For converting a Python string to lowercase, use `lower` as shown below.

*LOWERCASE STRING.

begin program python3.
myString = 'SPSS Is Fun!'
print(myString.lower()) # spss is fun!
end program.

## Convert Python String to Uppercase

In Python, `upper` converts a string object to uppercase.

*UPPERCASE STRING.

begin program python3.
myString = 'This is Some Title'
print(myString.upper()) # THIS IS SOME TITLE
end program.

## Capitalize Python String Object

In Python, “capitalizing” means returning a string with its first character in uppercase and all other characters in lowercase, even if they were uppercase in the original string.

*CAPITALIZE STRING.

begin program python3.
myString = 'aBcDeF'
print(myString.capitalize()) # Abcdef
end program.
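Note that `capitalize` looks at the string as a whole, not at separate words. The short sketch below (example strings made up for illustration) shows that only the very first character can become uppercase.

```python
sentence = 'spss IS fun'
capitalized = sentence.capitalize()
print(capitalized)  # Spss is fun

# If the first character is not a letter, nothing gets uppercased.
startsWithDigit = '1st PLACE'
digitResult = startsWithDigit.capitalize()
print(digitResult)  # 1st place
```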

## Remove Characters from Left Part of String

In Python, `lstrip()` without arguments removes all whitespace (spaces, tabs and line breaks) from the beginning of a string. Other leading characters can be removed by specifying them within the parentheses (line 12 below).

*REMOVE WHITESPACE FROM START OF STRING.

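begin program python3.
myString = '    left padding removed'
print(myString.lstrip()) # left padding removed
end program.

*REMOVE ASTERISKS (*) FROM START OF STRING.

begin program python3.
myString = '****left padding removed'
print(myString.lstrip('*')) # left padding removed
end program.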
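One pitfall: the argument of `lstrip` is treated as a set of characters, not as a prefix. The sketch below uses made-up strings to show the difference, and uses `startswith` with slicing as one possible way to remove an exact prefix instead.

```python
# All leading characters that occur in '*-' are removed, in any order.
myString = '*-*-left padding removed'
print(myString.lstrip('*-'))  # left padding removed

# Pitfall: 'var' means "any of v, a, r", so the second 'a' goes too.
varName = 'vara_01'
stripped = varName.lstrip('var')
print(stripped)  # _01

# Removing the exact prefix 'var' with startswith and slicing.
prefix = 'var'
if varName.startswith(prefix):
    withoutPrefix = varName[len(prefix):]
else:
    withoutPrefix = varName
print(withoutPrefix)  # a_01
```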

## Remove Characters from Right Part of String

The `rstrip` method works the same as lstrip but removes characters from the right side of some string.

*REMOVE WHITESPACE FROM END OF STRING.

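begin program python3.
myString = 'right padding removed    '
print(myString.rstrip()) # right padding removed
end program.

*REMOVE ASTERISKS (*) FROM END OF STRING.

begin program python3.
myString = 'right padding removed****'
print(myString.rstrip('*')) # right padding removed
end program.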

## Remove Characters from Left and Right Part of String

Just `strip` basically combines the Python `lstrip` and `rstrip` methods: it removes characters from both ends of a string.

*REMOVE WHITESPACE FROM START AND END OF STRING.

begin program python3.
myString = '    left and right padding removed    '
print(myString.strip()) # left and right padding removed
end program.

*REMOVE ASTERISKS (*) FROM START AND END OF STRING.

begin program python3.
myString = '****left and right padding removed****'
print(myString.strip('*')) # left and right padding removed
end program.

Sadly, this method doesn't have an SPSS equivalent, which is why we sometimes see LTRIM(RTRIM(MYSTRING)) in older syntax. Note that whitespace is often stripped automatically from string values in SPSS Unicode mode.

## Convert String to Integer

In Python, `int` converts a string to an integer. If the string doesn't represent a whole number (as in `'1.5'` or `'abc'`), it'll crash with a `ValueError`.

*CONVERT STRING TO INTEGER.

begin program python3.
myString = '123'
myInt = int(myString)
print(type(myInt)) # <class 'int'>
print(myInt) # 123
end program.
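When strings may not hold clean numbers, the error can be caught rather than letting the program stop. The sketch below is one possible approach; `toInt` is a hypothetical helper written for this illustration, not a built-in function.

```python
def toInt(text, default=None):
    """Return text as an integer, or default if it can't be converted."""
    try:
        return int(text)
    except ValueError:
        return default

print(toInt('123'))   # 123
print(toInt(' 42 '))  # 42 (surrounding whitespace is accepted)
print(toInt('-7'))    # -7 (a leading sign is accepted)
print(toInt('1.5'))   # None (a decimal point is not)
print(toInt('abc'))   # None
```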

## Split Python String into List Object

The example below splits a string into a Python list object. `split` needs a non-empty separator; if you omit it, the string is split on whitespace. Splitting a string into its separate characters can be done with a list comprehension (line 14 below).

*SPLIT STRING INTO PYTHON LIST OBJECT.

begin program python3.
myString = 'A A C A B C'
myList = myString.split(' ')
print(type(myList)) # <class 'list'>
print(myList) # ['A', 'A', 'C', 'A', 'B', 'C']
end program.

*SPLIT STRING INTO PYTHON LIST WITHOUT SEPARATOR.

begin program python3.
myString = 'AACABC'
myList = [i for i in myString]
print(myList) # ['A', 'A', 'C', 'A', 'B', 'C']
end program.
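The choice of separator matters as soon as a string holds repeated spaces. The sketch below, using a made-up string, compares `split(' ')` with `split()` and also shows `list` as a shorter alternative for splitting a string into characters.

```python
myString = 'A  A C'  # note the double space after the first A

bySpace = myString.split(' ')  # every single space is a separator
byWhitespace = myString.split()  # runs of whitespace count as one separator
print(bySpace)       # ['A', '', 'A', 'C']
print(byWhitespace)  # ['A', 'A', 'C']

# list() splits a string into its separate characters.
chars = list('AACABC')
print(chars)         # ['A', 'A', 'C', 'A', 'B', 'C']

# An empty separator is not allowed.
try:
    'AACABC'.split('')
    emptySeparatorWorked = True
except ValueError:
    emptySeparatorWorked = False
print(emptySeparatorWorked)  # False
```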

## Check if String Starts With...

*EVALUATE IF STRING STARTS WITH GIVEN SUBSTRING.

begin program python3.
myString = 'abcdef'
print(myString.startswith('abc')) # True
print(myString.startswith('bcd')) # False
end program.

*TYPICAL USE OF STARTSWITH().

begin program python3.
if myString.startswith('a'):
    print("First character is 'a'.")
else:
    print("First character is not 'a'.")
end program.

Note: True and False are the (only) 2 possible values for Booleans. We mostly use them in Python if statements, for running some code only if a condition holds.
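In practice, `startswith` often serves as a filter. The sketch below selects names from a list by their prefix; the variable names are hypothetical and merely stand in for the names in some SPSS dataset. It also shows that `startswith` accepts a tuple of prefixes.

```python
# Hypothetical variable names, assumed for this illustration only.
varNames = ['age', 'q1', 'q2', 'income', 'Q3']

# Case sensitive: 'Q3' is not selected.
selected = [name for name in varNames if name.startswith('q')]
print(selected)         # ['q1', 'q2']

# Lowercasing first makes the check case insensitive.
selectedAnyCase = [name for name in varNames if name.lower().startswith('q')]
print(selectedAnyCase)  # ['q1', 'q2', 'Q3']

# A tuple means "starts with any of these prefixes".
qOrA = [name for name in varNames if name.startswith(('q', 'a'))]
print(qOrA)             # ['age', 'q1', 'q2']
```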

## Check if String Ends With...

*EVALUATE IF STRING ENDS WITH GIVEN SUBSTRING.

begin program python3.
myString = 'abcdef'
print(myString.endswith('f')) # True
print(myString.endswith('e')) # False
end program.

## Left Pad String with Zeroes

In Python, `zfill(3)` left pads a string with zeroes up to a total length of 3 characters. We mostly do so when we want to sort numbers alphabetically: 002 comes before 010 and so on.

begin program python3.
myString = '1'
print(myString.zfill(3)) # 001
myString = '10'
print(myString.zfill(3)) # 010
end program.
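The effect on sorting is easiest to see side by side. The sketch below, with made-up numbers, sorts the same strings before and after zero padding. It also shows that `zfill` never shortens a string that is already longer than the requested length.

```python
numbers = ['10', '2', '1', '100']

# Strings sort character by character, so '100' comes before '2'.
plainSorted = sorted(numbers)
print(plainSorted)   # ['1', '10', '100', '2']

# After padding, alphabetical order matches numerical order.
paddedSorted = sorted(number.zfill(3) for number in numbers)
print(paddedSorted)  # ['001', '002', '010', '100']

# zfill only pads; longer strings are returned unchanged.
tooLong = '12345'.zfill(3)
print(tooLong)       # 12345
```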

So that's about it for Python string methods. I hope you found this tutorial helpful.