"I have a data file on which I'd like to carry out several regression analyses. I have four dependent variables, v1 through v4. The independent variables (v5 through v14) are the same for all analyses. How can I carry out these four analyses in an efficient way that would also work for 100 dependent variables?"
SPSS Python Syntax Example
*Run REGRESSION repeatedly over different dependent variables.
begin program.
import spss,spssaux
dependent = 'v1 to v4' # dependent variables.
spssSyntax = '' # empty Python string that we add SPSS REGRESSION commands to
depList = spssaux.VariableDict(caseless = True).expand(dependent) # create Python list of variable names
for dep in depList: # "+=" (below) appends one REGRESSION command per dependent variable to spssSyntax
    spssSyntax += '''
REGRESSION
/MISSING PAIRWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT %s
/METHOD=STEPWISE v5 to v14.
'''%dep # replace "%s" in syntax by the dependent variable
print(spssSyntax) # prints REGRESSION commands to SPSS output window
end program.
*If REGRESSION commands look good, have SPSS run them.
begin program.
spss.Submit(spssSyntax)
end program.
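The string-building step can also be tried outside SPSS. The sketch below is plain Python without the spss or spssaux modules, so nothing is actually submitted. The names REGRESSION_TEMPLATE and build_regression_syntax, and the predictors parameter, are made up for illustration, and the dependent variables are typed out by hand instead of being expanded from 'v1 to v4'.

```python
# Plain-Python sketch of the string building only; no SPSS required.
# REGRESSION_TEMPLATE and build_regression_syntax are illustrative names,
# not part of the spss or spssaux modules.
REGRESSION_TEMPLATE = '''
REGRESSION
/MISSING PAIRWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT %s
/METHOD=STEPWISE %s.
'''

def build_regression_syntax(dependents, predictors='v5 to v14'):
    """Return one REGRESSION command per dependent variable as a single string."""
    # The tuple after "%" fills the two "%s" placeholders in order.
    commands = [REGRESSION_TEMPLATE % (dep, predictors) for dep in dependents]
    return ''.join(commands)

syntax = build_regression_syntax(['v1', 'v2', 'v3', 'v4'])
print(syntax)
```

Inside SPSS, a string built this way could be handed to spss.Submit exactly like spssSyntax in the tutorial. Keeping the predictors as a second placeholder is only a convenience for reusing the template with other variable sets.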
Description
- This syntax uses Python, so you need to have the SPSS Python Essentials installed in order to run it;
- The syntax simply runs a standard SPSS regression analysis over different dependent variables, one by one;
- Except for the occurrence of %s, Python will submit to SPSS a textbook example of regression syntax generated by the GUI. It can be modified as desired.
- The TO and ALL keywords may be used for specifying the dependent and independent variables. The entire specification is enclosed in quotes.
- As a test file for this solution, you could use supermarket.sav.
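To make the TO and ALL keywords more concrete: TO refers to the order of variables in the data file, not to the numbers in their names. The sketch below imitates that expansion in plain Python. It is not how spssaux implements it; the function expand_to and the example variable order file_order are assumptions for illustration only.

```python
# Plain-Python imitation of TO / ALL expansion; not the spssaux implementation.
# expand_to and file_order are hypothetical names used for illustration.
def expand_to(spec, file_order):
    """Expand 'a to b' ranges and ALL against the file's variable order (case-insensitive)."""
    lookup = {name.lower(): pos for pos, name in enumerate(file_order)}
    tokens = spec.split()
    result = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.lower() == 'all':
            # ALL stands for every variable in the file
            result.extend(file_order)
            i += 1
        elif i + 2 < len(tokens) and tokens[i + 1].lower() == 'to':
            # "a to b" covers everything between a and b in file order
            start = lookup[token.lower()]
            end = lookup[tokens[i + 2].lower()]
            result.extend(file_order[start:end + 1])
            i += 3
        else:
            # a single variable name
            result.append(file_order[lookup[token.lower()]])
            i += 1
    return result

# Assumed file layout: an id variable followed by v1 through v14.
file_order = ['id'] + ['v%d' % n for n in range(1, 15)]
print(expand_to('v1 to v4', file_order))
```

If the variables were stored in a different order in the file, the same 'v1 to v4' specification would pick up whatever sits between v1 and v4, which is worth checking before looping over 100 dependent variables.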
THIS TUTORIAL HAS 22 COMMENTS:
By Donna on February 17th, 2016
Hi, Thank you so much for this, it's exactly what I'm looking for :)
I wonder do you have any information about how to include syntax for bootstrapping in this, i.e. how to carry out the bootstrapping command in each simultaneously run regression model here? This is the output code when the GUI is used:
BOOTSTRAP
/SAMPLING METHOD=SIMPLE
/VARIABLES TARGET=v1 INPUT= v4
/CRITERIA CILEVEL=95 CITYPE=PERCENTILE NSAMPLES=1000
/MISSING USERMISSING=EXCLUDE.
I assume the variables (v1, v2, v3, v4) can be replaced with something that allows this?
Any info would be greatly appreciated!
By Ruben Geert van den Berg on February 17th, 2016
Hi Donna! Indeed. You'll probably want to use a Python for loop similar to the one in this tutorial and replace either "v1", "v4" or both by %s in the syntax that'll be submitted. At the end of this syntax, specify what each instance of %s should be replaced by after another % sign. Does that do the trick for you?
By Emily on April 17th, 2017
I am a little confused about why I am getting an error when I run this (hopefully it is a silly mistake, as I am very new to Python in SPSS). I copied and pasted the syntax above and tried to run it on the supermarket dataset. However, most of the variables in the loop were kicked out and I am getting an error that "no variables were entered into the equation". Any help would be awesome, thanks!
By Ruben Geert van den Berg on April 18th, 2017
Hi Emily!
This is a really old tutorial so I rewrote the syntax. Take another look at it, it'll make much more sense now.
The third REGRESSION does not enter any independent variables due to the STEPWISE method (see SPSS Stepwise Regression - Example 2).
You can sort of confirm this by running
correlations v5 to v14 with v3.
Only v13 correlates slightly significantly with the dependent variable.
Hope that helps!
By Andy Wheeler on April 18th, 2017
See GLM - you don't need to estimate the linear regressions separately, you can estimate them in one command. Here it would be something like "GLM v1 v2 v3 v4 WITH v5 v6 v7 v8 v9 v10 /DESIGN v5 v6 v7 v8 v9 v10 /PRINT PARAMETER."
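For many variables, the single GLM command from this comment could be assembled in Python rather than typed out. This is only a sketch that follows the pattern quoted above and has not been run against SPSS here; the function name build_glm_syntax is made up, and the variable lists (v1 to v4 and v5 to v10) are simply the ones the comment uses.

```python
# Sketch: build the GLM command from the comment as a string.
# build_glm_syntax is an illustrative name, not an SPSS or spssaux function.
def build_glm_syntax(dependents, predictors):
    """Return one GLM command covering all dependents, following the comment's pattern."""
    preds = ' '.join(predictors)
    return 'GLM %s WITH %s /DESIGN %s /PRINT PARAMETER.' % (
        ' '.join(dependents), preds, preds)

dependents = ['v%d' % n for n in range(1, 5)]    # v1 ... v4
predictors = ['v%d' % n for n in range(5, 11)]   # v5 ... v10, as in the comment
glm_syntax = build_glm_syntax(dependents, predictors)
print(glm_syntax)
```

Note that a command like this enters all listed predictors for every dependent variable, so it is not a drop-in replacement for the STEPWISE method used in the tutorial.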