Definition
Scratch variables are temporary helper variables that don't show up in your data. A variable is a scratch variable if (and only if) its name starts with "#".
SPSS Scratch Variable - Introduction
- Say we have a data file holding phone numbers. Now we'd like to extract the area codes into a separate variable. This is done by taking all digits before the "-" sign.
- A very basic approach is to first find the position of the "-" by using INDEX. Next, we use this position in a SUBSTR function. The syntax example below demonstrates how to do this.
SPSS Syntax Example
*1. Create data.
data list free / phone(a11).
begin data
020-2868825 020-8243613 075-6485421 010-9854183 0249-897201
010-0039658 0249-985638 023-5925133 020-0520029 075-3297331
end data.
*2. Determine position of "-" sign.
compute position = char.index(phone,'-').
exe.
*3. Extract area code by using substring.
string area (a4).
compute area = char.substr(phone,1,position - 1).
exe.
SPSS Scratch Variable - Example
- After running the first example, we'll have the area codes separated. However, we don't want "position" in our data as well. It's merely a helper variable for taking the correct substring, so we'd have to delete it afterwards.
- A slightly cleaner solution is to use "#position" instead. Since this is a scratch variable, it won't show up in our data and there's no need for deleting it. The syntax below demonstrates this.
- Note that we'd normally use substitution (nesting one function inside another) for this, as demonstrated in our SPSS String Variables Tutorial.
SPSS Scratch Variable - Syntax Example
*1. Delete new variables prior to alternative approach.
delete variables position area.
*2. Determine position of "-" sign.
compute #position = char.index(phone,'-').
*3. Extract area code by using substring.
string area (a4).
compute area = char.substr(phone,1,#position - 1).
exe.
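The substitution mentioned above can be read as nesting CHAR.INDEX directly inside CHAR.SUBSTR, so that no helper variable is needed at all. The lines below are a minimal sketch of that idea, not necessarily the exact approach from the String Variables Tutorial. They assume the "phone" and "area" variables from the previous example still exist in the active dataset.

```spss
*Sketch only: assumes phone and area exist from the previous example.
*Remove area so it can be recreated.
delete variables area.
*Nest char.index inside char.substr instead of storing the position first.
string area (a4).
compute area = char.substr(phone,1,char.index(phone,'-') - 1).
exe.
```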
SPSS Scratch Variable - Special Features
- A first special feature of scratch variables is that they don't show up in your data. We already encountered this in the aforementioned example.
- A second special feature is that scratch variables are not reinitialized when a new case is read. This rather technical point means that the calculation for each case starts off from the value the variable held for the previous case (numeric scratch variables start at 0 for the first case). For non-scratch variables, the exact same effect is achieved by LEAVE. This makes it easy to calculate cumulative statistics over cases. Note that the method of choice for calculating cumulative statistics is usually LAG. The first syntax example below shows how to calculate cumulative sums by using a scratch variable.
- Finally, a data pass deletes all scratch variables. For more on data passes, see SPSS Transformation Commands. Scratch variables can't be used in procedures since these always involve a data pass. Running EXECUTE also forces a data pass and therefore deletes all scratch variables. This is shown in the second example below.
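For a regular (non-scratch) variable, LEAVE gives the same carry-over behavior described above. The lines below are a minimal sketch, assuming a small test dataset with a single numeric variable "id". The variable name "cumulative" is just illustrative.

```spss
*Sketch only: mini test dataset with an illustrative variable name.
data list free/id.
begin data
1 2 3 4 5
end data.
*LEAVE keeps the value of cumulative from case to case instead of resetting it.
compute cumulative = sum(id,cumulative).
leave cumulative.
exe.
```

Unlike a scratch variable, "cumulative" stays in the data after the data pass.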
SPSS Scratch Variable - Syntax Examples
*1. Create mini test dataset.
data list free/id.
begin data
1 2 3 4 5
end data.
*2. The calculation of #cumulative starts off from the previous value. Therefore, a cumulative sum is calculated like so.
compute #cumulative = sum(id,#cumulative).
compute cumulative = #cumulative.
exe.
*3. Delete 'cumulative' prior to next example.
delete variables cumulative.
*4. The first 'exe' triggers a data pass which deletes #cumulative. This causes the second 'compute' to fail.
compute #cumulative = sum(id,#cumulative).
exe.
compute cumulative = #cumulative.
exe.
THIS TUTORIAL HAS 1 COMMENT:
By Oscar Ortiz Alvarez on January 15th, 2017
I was programming the Grubb test to detect extreme values and did not know how to create a temporary variable. In STATA it is local.
Thank you very much for your help, it helped me a lot.
Oscar.