A substring is a contiguous sequence of characters within a string. In SPSS, substrings are extracted with CHAR.SUBSTR (SPSS version 16 and higher) or just SUBSTR (SPSS version 15 and lower). CHAR.SUBSTR takes two or three arguments, as shown by the minimal example below.
SPSS CHAR.SUBSTR - Minimal Example
COMPUTE var_2 = CHAR.SUBSTR(var_1,3,2).
The three arguments mean the following:
- var_1 denotes the variable from which the substring is taken;
- 3 is the position of the first character that's extracted;
- 2 is the number of characters to extract.
Altogether, this first example implies that var_2 will consist of characters 3 and 4 of var_1.
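For readers who find Python easier to read, the argument convention can be sketched as a small function. This is only an illustration: char_substr is a hypothetical helper written for this comparison, not part of SPSS or of any Python library, and it does not handle invalid positions. Its main point is that SPSS counts positions from 1, whereas Python counts from 0.

```python
def char_substr(text, start, length=None):
    """Hypothetical helper mimicking CHAR.SUBSTR(text, start, length)."""
    begin = start - 1  # SPSS positions start at 1, Python indices at 0
    if length is None:
        # No length given: take everything from the start position onward.
        return text[begin:]
    return text[begin:begin + length]


print(char_substr("abcdef", 3, 2))  # characters 3 and 4: cd
print(char_substr("abcdef", 3))     # character 3 through the end: cdef
```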
SPSS Substring Syntax Examples
The examples below use webdesigners.sav.
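*Adjust this folder to wherever webdesigners.sav is stored on your computer.
cd 'C:\xampp\htdocs\spss-tutorials\wp-content\themes\spss-tutorials-10\dont_upload\@external files\SPSS\test_data_creation\webdesigners'.
get file 'webdesigners.sav'.
*Declare the new string variables before computing them.
string fname lname company tld (a30).
*First character of email.
compute fname = char.substr(email,1,1).
execute.
*Characters 3 and 4 of email.
compute fname = char.substr(email,3,2).
execute.
*Everything before the first period in email.
compute fname = char.substr(email,1,char.index(email,'.') - 1).
execute.
*Capitalize the first letter of fname.
compute fname = concat(upper(char.substr(fname,1,1)),char.substr(fname,2)).
execute.
*Everything between the first period and the @ sign.
compute lname = char.substr(email,char.index(email,'.') + 1,char.index(email,'@') - 1 - char.index(email,'.')).
execute.
*Capitalize the first letter of lname.
compute lname = concat(upper(char.substr(lname,1,1)),char.substr(lname,2)).
execute.
*Everything after the @ sign.
compute company = char.substr(email,char.index(email,'@') + 1).
execute.
*Everything after the last period in company.
compute tld = char.substr(company,char.rindex(company,'.') + 1).
execute.
document 'bal'.
document 'bol'.
display documents.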
Explanation
- In SPSS, a substring can be extracted by using CHAR.SUBSTR(a,b,c).
- Here, a refers to the string from which the substring should be taken.
- The second argument, b, indicates the starting position ("start at the bth character").
- The third argument, c, is the length of the substring. It may be omitted, in which case all characters from the starting position onward will be extracted.
- As the syntax examples show, b and c don't have to be fixed numbers. They may be replaced by (for example) the position of the last space in a string, which is returned by CHAR.RINDEX.
- The CHAR prefix may often be omitted. Exactly when is explained in Unicode mode.
- Just SUBSTR can be used for modifying the original string in many cases. This will always work on strings consisting of single-byte characters. Again, see Unicode mode.
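The way the syntax combines substring and index functions can also be sketched in Python. The address below is a made-up example rather than a value from webdesigners.sav, and the sketch assumes every address looks like first.last@company.tld. Here, find and rfind play the roles of CHAR.INDEX and CHAR.RINDEX, but they return positions counted from 0 instead of 1.

```python
email = "john.smith@example.com"  # hypothetical address, not taken from the data file

dot = email.find(".")  # index of the first period (counted from 0)
at = email.find("@")   # index of the @ sign (counted from 0)

fname = email[:dot]           # everything before the first period
lname = email[dot + 1:at]     # everything between the first period and the @ sign
company = email[at + 1:]      # everything after the @ sign
tld = company[company.rfind(".") + 1:]  # everything after the last period

# Capitalize the first letter and keep the rest unchanged.
fname = fname[:1].upper() + fname[1:]
lname = lname[:1].upper() + lname[1:]

print(fname)
print(lname)
print(company)
print(tld)
```

For this address, the four values are John, Smith, example.com and com.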
Python Substring Examples
begin program.
pets = 'Cat Dog Rat'
# Characters at indices 4 through 6: 'Dog'.
print(pets[4:7])
# Everything after the last space: 'Rat'.
print(pets[pets.rfind(" ") + 1:])
end program.
Explanation
- In Python, a substring can be extracted from a string by using square brackets []. These enclose the index or indices of the character(s) to be extracted. Note that Python starts counting at 0, so index 1 refers to the second character.
- Extracting a range this way is called slicing. (The same bracket notation works on more than just strings. For instance, mylist[1] would return the second element from a list called "mylist".)
- A range of characters is specified by a colon (:).
- For example, [1:4] returns the second through the fourth elements. This is because the start index is included but the end index is not: the slice stops at (the end index - 1).
- In a similar vein, if the start index is omitted (as in [:4]), the first through the fourth elements are returned.
- Finally, if the end index is omitted ([1:]), the second through the final elements are returned.
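The slicing rules above can be checked with a short sketch. The string "abcdef" is an arbitrary example chosen so that each result is easy to verify by eye; the code is plain Python and can be run outside SPSS.

```python
letters = "abcdef"

print(letters[1])    # single index, second character: b
print(letters[1:4])  # second through fourth characters: bcd
print(letters[:4])   # first through fourth characters: abcd
print(letters[1:])   # second through final characters: bcdef
```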
THIS TUTORIAL HAS 5 COMMENTS:
By Adam_S on October 25th, 2017
How about you make the example dataset you reference in this short tutorial available. Thanks.
By Ruben Geert van den Berg on October 26th, 2017
Hi Adam, I totally agree so I fixed it right away.
This is one of the ancient tutorials that should be rewritten from scratch -like many others- but I'm not going to have the time for doing so any time soon so that's why the dumb mistake was still there.
For substrings in SPSS, you could also consult SPSS String Variables Tutorial.
Substrings in Python are shown in SPSS Python String Tutorial.
HTH!
By Adam_S on October 26th, 2017
Thank you for getting the example dataset available again. Your site is very helpful, even if some of the tutorials are a little old. Thank you very much.
By Philip Reimers on April 14th, 2020
i´m using spss 26 now. My old Syntax
if ((any(substr(nace,1,2),'41','42','43'))) gruppe=2.
doesn´t work any more with spss 26. What is the correct new Syntax for spss26?
By Ruben Geert van den Berg on April 14th, 2020
Hi Philip!
Precisely what happens if you run the syntax?
One thing I can think of is that your old SPSS version is in Unicode mode and your new version isn't, or vice versa. For SPSS 16 and higher, you should always use CHAR.SUBSTR instead of just SUBSTR.
Kind regards,
SPSS tutorials