SPSS Tutorials


Adjust String Lengths before Merging Files


"I'd like to merge a number of data files with the ADD FILES command but it doesn't work. Some similarly named string variables have different lengths over different files. Is there an easy way to automatically adjust the lengths of all string variables in order to merge my files?"

SPSS Python Syntax Example

*1. Create some test data files.

begin program.
import random, spss
rdir = 'D:/temp2/' # Specify empty folder for creating test files (keep the trailing slash).
for cnt in range(1, 5):
    # Four random string lengths, so similarly named variables differ in length over files.
    lens = [random.randint(1, 8) for i in range(4)]
    # Variable specification such as 'sv_1(a3) sv_2(a7) sv_3(a1) sv_4(a5)'.
    svars = ' '.join(['sv_%d(a%d)' % (i + cnt, j) for i, j in enumerate(lens)])
    spss.Submit('''data list list/v1(f1) v2(f2) %s.
begin data
%d %d%d %s
end data.
compute bla = date.dmy(1,1,2014).
formats bla(date11).
save out '%sfile_%d.sav'.
new file.
''' % (svars, cnt, cnt, cnt, ' '.join(['a' * i for i in lens]), rdir, cnt))
end program.
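
The variable specification in the test data syntax is rather dense. The sketch below isolates that single expression so it can be tried in plain Python, outside of SPSS. The function name string_specs and the fixed lengths are assumptions made for illustration; the syntax above draws the lengths at random.

```python
def string_specs(cnt, lens):
    """Return the string variable part of DATA LIST for file number cnt."""
    # The variable number is offset by cnt, so successive files share some names.
    return ' '.join('sv_%d(a%d)' % (i + cnt, j) for i, j in enumerate(lens))

# Fixed lengths are used instead of random ones so the result is predictable.
spec_file_1 = string_specs(1, [3, 8, 1, 5])
spec_file_2 = string_specs(2, [6, 2, 4, 7])
print(spec_file_1)  # sv_1(a3) sv_2(a8) sv_3(a1) sv_4(a5)
print(spec_file_2)  # sv_2(a6) sv_3(a2) sv_4(a4) sv_5(a7)
```

In this example sv_2, sv_3 and sv_4 occur in both files with different lengths, which is the situation the question describes.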

*2. Actual solution: adjust all string lengths.

begin program.
import os, spss, spssaux
rdir = 'd:/temp2' # Specify folder holding files to be merged.
files = [os.path.join(rdir, fil) for fil in os.listdir(rdir) if fil.endswith('.sav')]
slens = {} # Longest length found for each string variable name.
# Pass 1: look up the longest length of each string variable over all files.
for fil in files:
    spss.Submit('get file "%s".' % fil)
    for cnt in range(spss.GetVariableCount()):
        if spss.GetVariableType(cnt) != 0: # 0 indicates a numeric variable.
            name = spss.GetVariableName(cnt)
            leng = int(spss.GetVariableFormat(cnt)[1:]) # For example 'A8' -> 8.
            if leng > slens.get(name, 0):
                slens[name] = leng
# Pass 2: set each string variable to its longest length and save the file.
for fil in files:
    spss.Submit('get file "%s".' % fil)
    names = spssaux.GetVariableNamesList()
    # items() works in both Python 2 and Python 3; iteritems() is Python 2 only.
    specs = ['%s(a%d)' % (var, leng) for var, leng in slens.items() if var in names]
    if specs: # Skip files that hold no string variables.
        spss.Submit('alter type %s.' % ' '.join(specs))
        spss.Submit('save out "%s".' % fil)
spss.Submit('new file.')
end program.
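
The solution comes down to two steps: find the longest length of each string variable over all files, then build one ALTER TYPE command per file for the variables that file actually holds. The sketch below shows just that logic in plain Python, so it runs without the spss module. The function names and the small files dictionary are hypothetical stand-ins for what the syntax above reads from the .sav files.

```python
def max_string_lengths(files):
    """files maps a file name to a {variable name: string length} dict."""
    slens = {}
    for formats in files.values():
        for name, leng in formats.items():
            # Keep the longest length seen so far for this variable name.
            if leng > slens.get(name, 0):
                slens[name] = leng
    return slens

def alter_type_command(names, slens):
    """Build ALTER TYPE for the variables in names; None if none apply."""
    specs = ['%s(a%d)' % (name, slens[name]) for name in names if name in slens]
    return 'alter type %s.' % ' '.join(specs) if specs else None

# Hypothetical string lengths for two files (numeric variables left out).
files = {
    'file_1.sav': {'sv_1': 3, 'sv_2': 8},
    'file_2.sav': {'sv_2': 6, 'sv_3': 2},
}
slens = max_string_lengths(files)
command = alter_type_command(['v1', 'sv_2', 'sv_3'], slens)
print(command)  # alter type sv_2(a8) sv_3(a2).
```

Returning None for a file without any string variables avoids submitting an empty ALTER TYPE command, which SPSS would reject.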



