In SPSS, “procedures” refer to all SPSS commands that read the data and are carried out immediately when run.
SPSS Procedures - What and Why
So what is meant by “SPSS procedures” and why does it matter whether a command is a procedure, a transformation or some other command? This tutorial briefly addresses these two questions, using age_income.sav for a few demonstrations. The screenshot below shows what the data look like in data view after opening the file.
Now, perhaps we'd like to see some basic DESCRIPTIVES for income. We can generate those by running the following syntax: descriptives income. The screenshot below shows the result in the output viewer window.
Now, how can SPSS present us with these statistics? It does so by what's known as “reading the data” or a data pass in SPSS.
SPSS Data Passes
In SPSS, a data pass refers to the process of SPSS going through all cases in the data (from top to bottom) in order to collect the values on one or more variables. The figure below illustrates the process.
Now if we consult the command syntax reference on DESCRIPTIVES, we'll encounter the following statement: This command reads the active dataset and causes execution of any pending commands. The first part of the statement indicates that running DESCRIPTIVES triggers a data pass. The “pending commands” refer to any transformations that were run but not yet executed when a procedure is run. We'll demonstrate this below.
In the data at hand, all incomes are stated in Euros. But what if we'd like to have them in Dollars? When the data were collected, 1 Euro corresponded to 1.1 Dollars. We'll therefore COMPUTE income in dollars by running compute income_dollars = income * 1.1. Now, since COMPUTE is a transformation command, it's not immediately carried out. The result is shown in the next figure, left half. However, running any procedure - even if unrelated to such “pending transformations” - will execute them. For instance, running descriptives age. executes our previous COMPUTE command.
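The sequence just described can be sketched as the syntax below. It assumes age_income.sav is open as the active dataset and uses only the commands mentioned above; the comments describe what we'd expect to see in data view at each step.

```spss
* Assumes age_income.sav is open as the active dataset.
* COMPUTE is a transformation, so it stays pending after this line is run.
compute income_dollars = income * 1.1.

* DESCRIPTIVES is a procedure and triggers a data pass.
* That data pass also executes the pending COMPUTE, although age is unrelated to it.
descriptives age.
```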
SPSS Procedures - Practical Implications
At this point we have an idea of what's meant by “procedures” in SPSS. So why is it important to distinguish procedures from transformations and other commands? The reason is that procedures behave differently from other commands in several ways:
- procedures cause all transformations to be executed;
- in contrast to transformations, procedures can't be used within a LOOP, DO REPEAT or DO IF command;
- procedures indicate the end of TEMPORARY and reverse temporary transformations;
- procedures indicate the end of any VECTOR;
- procedures delete all scratch variables.
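Two of the points above, TEMPORARY and scratch variables, can be illustrated with a minimal sketch on age_income.sav. The age cutoff of 30 and the scratch variable name #rate are made up for illustration only; they don't come from the data or the tutorial.

```spss
* TEMPORARY only applies up to and including the next procedure.
* The cutoff of 30 is a hypothetical value for illustration.
temporary.
select if (age >= 30).
descriptives income.

* The procedure above ended TEMPORARY, so this one uses all cases again.
descriptives income.

* Scratch variables have names starting with # and are deleted by the next procedure.
* The name #rate is hypothetical.
compute #rate = 1.1.
compute income_dollars = income * #rate.
descriptives income_dollars.

* At this point #rate no longer exists but income_dollars does.
```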
SPSS Procedures - Processing Time
SPSS procedures involve data passes; they go through all cases in your data from top to bottom. As a consequence, their processing time depends on the number of cases in your data. If you have many (say 1,000,000 or over) cases in your data, running a procedure may take a few seconds on a modern computer. This does not hold for transformations and other commands, since these don't trigger a data pass by themselves.
If you work on a huge dataset and processing time is a real issue, minimizing the number of procedures will make your syntax run faster. For example, descriptives age income. runs faster than two separate DESCRIPTIVES commands for these two variables because it needs only one data pass instead of two. Finally, removing all unnecessary EXECUTE commands is another great way to speed things up, since each EXECUTE triggers a data pass of its own.
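As a rough sketch of both tips, again assuming age_income.sav is the active dataset, the syntax below contrasts the slower and faster variants. Actual time savings depend on the size of your data and are not measured here.

```spss
* Slower: two procedures, so two data passes.
descriptives age.
descriptives income.

* Faster: one procedure, so one data pass for both variables.
descriptives age income.

* Slower: this EXECUTE is unnecessary because a procedure follows anyway.
compute income_dollars = income * 1.1.
execute.
descriptives income_dollars.

* Faster: the data pass of DESCRIPTIVES also executes the pending COMPUTE.
compute income_dollars = income * 1.1.
descriptives income_dollars.
```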