"I have a data file on which I'd like to carry out several regression analyses. I have four dependent variables, v1 through v4. The independent variables (v5 through v14) are the same for all analyses. How can I carry out these four analyses in an efficient way that would also work for 100 dependent variables?"
SPSS Python Syntax Example
*Run REGRESSION repeatedly over different dependent variables.
begin program.
import spss,spssaux
dependent = 'v1 to v4' # dependent variables.
spssSyntax = '' # empty Python string that we add SPSS REGRESSION commands to
depList = spssaux.VariableDict(caseless = True).expand(dependent) # create Python list of variable names
for dep in depList: # "+=" (below) concatenates SPSS REGRESSION commands to spssSyntax
    spssSyntax += '''
REGRESSION
/MISSING PAIRWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT %s
/METHOD=STEPWISE v5 to v14.
'''%dep # replace "%s" in syntax by dependent var
print(spssSyntax) # prints REGRESSION commands to SPSS output window
end program.
*If REGRESSION commands look good, have SPSS run them.
begin program.
spss.Submit(spssSyntax)
end program.
Description
- This syntax uses Python, so you need to have the SPSS Python Essentials installed in order to run it;
- The syntax simply runs a standard SPSS regression analysis over different dependent variables one-by-one;
- Except for the occurrence of %s, Python will submit to SPSS a textbook example of regression syntax generated by the GUI. It can be modified as desired.
- The TO and ALL keywords may be used for specifying the dependent and independent variables. The entire specification is enclosed in quotes.
- As a test file for this solution, you could use supermarket.sav.
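The string-building step described above can be tried without SPSS. Below is a minimal sketch in plain Python. The helper name build_regression_syntax and its arguments are hypothetical (they are not part of the spss or spssaux modules), and the dependent variable names are typed out by hand instead of being expanded from 'v1 to v4'.

```python
# Sketch only: builds the same REGRESSION commands as the tutorial syntax,
# but in plain Python, so the result can be previewed without SPSS.
# build_regression_syntax is a hypothetical helper, not part of spss/spssaux.

TEMPLATE = '''
REGRESSION
/MISSING PAIRWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT %s
/METHOD=STEPWISE %s.
'''

def build_regression_syntax(dependents, independents='v5 to v14'):
    """Return one REGRESSION command per dependent variable, concatenated."""
    syntax = ''
    for dep in dependents:
        # The tuple supplies one value per "%s" placeholder, in order.
        syntax += TEMPLATE % (dep, independents)
    return syntax

# Assumed variable names, written out instead of expanding 'v1 to v4'.
spssSyntax = build_regression_syntax(['v1', 'v2', 'v3', 'v4'])
print(spssSyntax)
```

Inside a begin program block, a string built this way could then be handed to spss.Submit, as in the tutorial syntax.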
THIS TUTORIAL HAS 24 COMMENTS:
By Ruben Geert van den Berg on April 18th, 2017
Thanks for your comment, Andy! Interesting idea, hadn't occurred to me at all.
However, I couldn't run the syntax because it requires the SPSS Advanced Statistics module and I don't have that on my home computer.
Do you know if the GLM approach is restricted to the ENTER method, or is some kind of STEPWISE (as used in the tutorial) available as well?
By Andy Wheeler on April 18th, 2017
GLM does not have stepwise, and it isn't clear how you would do stepwise for a multivariate model. E.g., in your linear regression equations, v1 could end up with a different final set of predictors than v4.
By Emma on July 25th, 2017
How do I add a title to each regression, so that the title is the dependent variable? (The title needs to be different for each regression.) Also, is there a way to make a table of contents, so that when I export to Word, I can find the graphs easily?
By Ruben Geert van den Berg on July 26th, 2017
Hi Emma!
Sure, you could add a TITLE command to the syntax, containing %s. After the ending quotes for the SPSS syntax, make sure %s is replaced by the name (or perhaps the variable label) of the dependent variable.
I'm not sure how the titles will come through when exporting to .rtf/WORD, but I think they'll be seen as just plain paragraph text. I don't see any way to have them exported as true title elements, but I should add that I never tried either.
Hope that helps!
By Emma on July 26th, 2017
Thank you for your quick reply, Ruben! I tried to take your recommendation for the title, but I received the following error message:
TypeError: not enough arguments for format string
Here is what my script looks like:
begin program.
import spss,spssaux
dependent = 'NA.3rd.Ventricle to Right.Accumbens.Area'
spssSyntax = ''
depList = spssaux.VariableDict(caseless = True).expand(dependent) # create Python list of variable names
for dep in depList: # "+=" (below) concatenates SPSS REGRESSION commands to spssSyntax
    spssSyntax += '''
TITLE %s
REGRESSION
/MISSING PAIRWISE
/STATISTICS COEFF COLLIN
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT %s
/METHOD=ENTER EMOT_Composite FINNBH_avg_composite
/METHOD=ENTER COG_Composite.
'''%dep # replace "%s" in syntax by dependent var
print spssSyntax
end program.
My program runs if I delete the title line. Thanks for the help!
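The TypeError quoted above is what Python raises when a template holds more %s placeholders than values supplied: the script contains two (TITLE and /DEPENDENT) but passes only dep. Below is a minimal sketch of two possible fixes, runnable without SPSS. The two-name variable list stands in for the expanded depList, and the quoted, period-terminated TITLE line is an assumption (SPSS commands are normally ended with a period), not part of the original script.

```python
# Sketch: two ways to fill a template that uses the dependent variable twice.
# These two names stand in for the list that spssaux would expand.
depList = ['NA.3rd.Ventricle', 'Right.Accumbens.Area']

# Option 1: positional placeholders, filled from a tuple with one entry each.
template_positional = '''
TITLE '%s'.
REGRESSION
/MISSING PAIRWISE
/STATISTICS COEFF COLLIN
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT %s
/METHOD=ENTER EMOT_Composite FINNBH_avg_composite
/METHOD=ENTER COG_Composite.
'''

# Option 2: a named placeholder, reusable any number of times from one dict.
template_named = template_positional.replace('%s', '%(dep)s')

# Passing a single value to two placeholders reproduces the reported error.
try:
    template_positional % depList[0]
except TypeError as err:
    print(err)

spssSyntax = ''
namedSyntax = ''
for dep in depList:
    spssSyntax += template_positional % (dep, dep)   # one value per "%s"
    namedSyntax += template_named % {'dep': dep}     # same result, by name

print(spssSyntax)
```

Both options produce identical syntax. The named form may be easier to maintain if the dependent variable ends up appearing in more places.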