Converting an SPSS string variable into a numeric one is simple. However, there's a huge **pitfall** that few people are aware of: string values that can't be converted into numbers result in system missing values without SPSS throwing any error or warning.

This **can mess up your data** without you being aware of it. Don't believe me? I'll demonstrate the problem (and the solution) on convert-strings.sav, part of which is shown below.

## SPSS Strings to Numeric - Wrong Way

First off, you *can* convert a string into a numeric variable in variable view as shown below.

Now, I never use this method myself because

- I can't apply it to **many variables** at once, so it may take way more effort than necessary;
- it doesn't generate any syntax: there's no Paste button and nothing's appended to my journal file;
- it can **mess up the data**. However, there are remedies for that.

## So What's the Problem?

Well, let's *do it* rather than read about it. We'll

- set empty cells as user missing values for s3;
- convert s3 to numeric in variable view;
- run descriptives on the result.

***Set empty string as user missing value for s3.**

missing values s3 ('').

***Inspect frequency table for s3.**

frequencies s3.

***Now manually convert s3 to numeric under variable view.**

***Inspect result.**

descriptives s3.

***N = 444 instead of 459. That is, 15 values failed to convert and we've no clue why.**

## Result

Note that some values in our string variable have been flagged with “a”. We probably want these to be converted into numbers. We have **459 valid values** (non-empty cells).

After converting our variable to numeric, we ran some descriptives. Note that we only have N = 444. Apparently, **15 values failed to convert**, which is probably not what we want. And we usually **won't notice this problem** because we don't get any warning or error.

## Conversion Failures - Simplest Solution

Right, so how can we **perform the conversion safely**? Well, we just

- inspected **frequency tables**: how many non-empty values do we have before the conversion?
- **converted** our variable(s) to numeric;
- inspected N in a **descriptive statistics** table after the conversion. If N is lower than the number of non-empty string values (frequencies before conversion), then something may be wrong.

In our first example, the frequency table already suggested we must **remove the “a”** from all values before converting the variable. We'll do just that in a minute.

Although safe, I still think this method is too much work, especially for multiple variables. **Let's speed things up** by using some handy syntax.

## SPSS - String to Numeric with Syntax

The fastest way to convert string variables into numeric ones is with the ALTER TYPE command. It allows us to **convert many variables with a single line** of syntax. Note that ALTER TYPE requires SPSS version 16 or higher; for SPSS 15 or below, use the NUMBER function instead.
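For readers on SPSS 15 or below, a minimal sketch of the NUMBER route could look as follows. The variable name n1 and the f7 format are assumptions for illustration: NUMBER writes its result into a new numeric variable and, as far as I know, only reads as many characters as the format is wide, so the width should cover the entire string value.

```spss
* Sketch for SPSS 15 or below. n1 is a hypothetical new variable.
* f7 assumes string values of at most 7 characters.
compute n1 = number(s1, f7).

* Inspect N just like after ALTER TYPE.
descriptives n1.
```

Values that can't be read come out as system missing here too, so the same checks on N apply. The remainder of this tutorial uses ALTER TYPE.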

The syntax below converts all string variables in one go. We then check a descriptives table. **If we don't have any system missing values, we're done**.

## SPSS ALTER TYPE Example

***Close data without saving and reopen before proceeding.**

***Convert all variables in one go.**

alter type s1 to s3 (f1) s4 (f6.3).

***Inspect descriptives.**

descriptives s1 to s4.

Note: using `alter type s1 to s4 (f1).` will also work but the decimal places for s4 won't be visible. This is why we set the correct f format: `f6.3` means 6 characters including the decimal separator and 3 decimal places as in 12.345, which is the format of our string values.

## Result

Since we've 480 cases in our data, we're done for s1. However, the other 3 variables contain system missings so we **need to find out why**. Since we can't undo the operation, let's close our data without saving and reopen it.
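Reopening the data can be done with syntax too. The line below is a minimal sketch, assuming convert-strings.sav sits in SPSS's current working directory (otherwise a full path is needed) and that the active dataset has not been given a dataset name.

```spss
* Reopen the original data. The file location is an assumption.
* If the active dataset is unnamed, its unsaved changes are discarded.
get file 'convert-strings.sav'.
```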

## Solution 2: Copy String Variables Before Conversion

Things now become a bit more technical. However, readers who struggle their way through will learn a **very efficient solution** that works for many other situations too. We'll basically

- copy all string variables;
- convert all string variables;
- compare the original to the converted variables.

Precisely, we'll **flag non empty string values that are system missing** after the conversion. As these are at least suspicious, we'll call those conversion failures. This may sound daunting but **it's perfectly doable** if we use the right combination of commands. Those are mainly STRING, RECODE, DO REPEAT and IF.

## Copy and Convert Several String Variables

***Close data without saving and reopen before proceeding.**

***Copy all string variables.**

string c1 to c4 (a7).

recode s1 to s4 (else = copy) into c1 to c4.

***Convert variables to numeric.**

alter type s1 to s3 (f1) s4 (f6.3).

***For each variable, flag conversion failures: cases where converted value is system missing but original value is not empty.**

do repeat #conv = s1 to s4 / #ori = c1 to c4 / #flags = flag1 to flag4.

if(sysmis(#conv) and #ori <> '') #flags = 1.

end repeat.

***If N > 0, conversion failures occurred for some variable.**

descriptives flag1 to flag4.

## Result

Only flag3 and flag4 contain some conversion failures. We can visually inspect what the problem is by moving these cases to the top of our dataset.

***Visually inspect why values fail to convert.**

sort cases by flag3 (d).

***Some values flagged with 'a'.**

sort cases by flag4 (d).

***Some values flagged with 'a' through 'e'.**
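As an alternative to scrolling through the sorted data, the sketch below tabulates only the original values that failed to convert. It assumes that the copies (c3, c4) and the flags (flag3, flag4) created by the syntax above are still in the data. TEMPORARY limits each selection to the next procedure, so no cases are actually deleted.

```spss
* Show the distinct original s3 values that failed to convert.
temporary.
select if (flag3 = 1).
frequencies c3.

* Same check for s4.
temporary.
select if (flag4 = 1).
frequencies c4.
```

The resulting frequency tables list each offending value once, which may be easier to read than the sorted data when many cases are flagged.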

## Remove Illegal Characters, Copy and Convert

Some values are flagged with letters “a” through “e”, which is why they fail to convert. We'll now fix the problem. First, we close our data without saving and reopen it. We then rerun our previous syntax but remove these letters before the conversion.

## Syntax

***Close data without saving and reopen before proceeding.**

***Copy all string variables.**

string c1 to c4 (a7).

recode s1 to s4 (else = copy) into c1 to c4.

***Remove 'a' from s3.**

compute s3 = replace(s3,'a','').

***Remove 'a' through 'e' from s4.**

do repeat #char = 'a' 'b' 'c' 'd' 'e'.

compute s4 = replace(s4,#char,'').

end repeat.

***Try and convert variable again.**

alter type s1 to s3 (f1) s4 (f6.3).

***Flag conversion failures again.**

do repeat #conv = s1 to s4 / #ori = c1 to c4 / #flags = flag1 to flag4.

if(sysmis(#conv) and #ori <> '') #flags = 1.

end repeat.

***Inspect if conversion succeeded.**

descriptives flag1 to flag4.

***N = 0 for all flag variables so we're done.**

***Delete copied and flag variables.**

delete variables c1 to flag4.

## Result

All flag variables contain only (system) missings. This means that we no longer have any conversion failures; all **variables have been correctly converted**. We can now delete all copy and flag variables, save our data and move on.

Thanks for reading!

## THIS TUTORIAL HAS 12 COMMENTS:

## By Yaser Pourdavar on April 18th, 2016

Hi Ruben

Thanks.

## By William Peck on November 2nd, 2018

Good one! I'm coming from a programmer's perspective, so the hands on SPSS exercises are helpful and I'm getting the point, especially in this one. The core statistics stuff is kind of making my head spin, having taken a Prob and Stats course like 30 years ago ...

## By Ruben Geert van den Berg on November 2nd, 2018

I think most of the older educational material is disastrous. Written by and for mathematicians and -surprise!- completely unsuitable for social scientists.

Generally, I feel social scientists are smart enough to understand statistics. However, if you tell the story in a language they don't master -mathematical formulas- then you can't blame them for not getting it.

And that's exactly what's still happening in most universities. I'd love to come up with a complete course based on visualizations and simple language examples but writing it takes forever and I've no means to pay my bills in the meantime :-/

## By William Peck on November 2nd, 2018

I was going to jump into an IBM 2-day course, but then I started looking around online and there's a whole lot of SPSS materials. I just started with this and am going to keep going ...

## By C. Hendrix Grupee on November 21st, 2018

The instructions are helpful but it's challenging for a rookie in statistics, like me, trying to complete the data analysis for my dissertation. The SPSS version that I used in my research courses was 21, now updated to version 25. I didn't learn about syntax so it's difficult for me to get to grips with it readily.

I imported my data from Excel and some of them came up in SPSS as string variables and I use the tab in SPSS to convert to numeric variables so my analysis is not producing any reasonable output.

I need help!