# Welcome to my revamped SPSS Guide!

This guide is supposed to work as a brief "online help" for IBM® SPSS® for Windows® via WWW, but with little adaptation, most of what is presented here will work with other versions of SPSS (or PASW, as the software was temporarily called). Its aim is to provide an intermediate road that is hopefully especially convenient for beginners, or more precisely, for beginners with a basic understanding of how SPSS works and what it is good for.

For a couple of years now, I have somewhat neglected this guide. However, while nothing new is being added, the old stuff will likely stay valid for another couple of decades. Perhaps some people still find it useful. In November 2016, David Peplow did a terrific job creating a new design for my web projects. I am deeply grateful.

Note that this guide does not introduce you to the basics of working with SPSS for Windows, e.g., the different "windows", how to set up a database, how to execute commands from a syntax file, etc. These are things that are very tiresome to explain in writing and very easy to explain simply by demonstrating and rehearsing (and some trial and error). But once you have developed a basic idea of how the program works, this guide may hopefully be of some help.

Hope you enjoy it!

Wolfgang Ludwig-Mayerhofer

This guide gives only a few examples for the most common SPSS procedures. It places some emphasis on data handling and transformation, topics that are sadly neglected in most of the German books on SPSS for Windows. Statistical procedures, especially the more sophisticated ones, are treated here only superficially or (in most instances) not at all. For instance, psychologists will miss treatment of analysis of variance procedures, whereas economists will find nothing about time series analysis. Also, this guide says nothing about how to change charts interactively. Thus, this guide is in no way exhaustive and cannot function as a substitute for either the program's online help or the handbooks that are provided with the program. Users are urged to consult especially the handbooks for further detail.

Throughout, it is assumed that SPSS for Windows users work via Syntax Files. The use of menus is explicitly discouraged, with very few exceptions where the menus have some advantage over writing the syntax manually. Even in these cases, it is usually highly recommended that the command not be executed from the menu, but rather be pasted into the Syntax Window and executed from there. Those people who want to learn to work with the menu system may consult some of the sources listed below in the links section. Obviously, the menu is helpful for an occasional quick glance at the data, but it should not be at the core of your work.

Here are my reasons why working with syntax files should be preferred over using the menu system:

1. Many features are available only via syntax files. You never will become an advanced user of SPSS if you stick to the menus only.
2. Even though many people think otherwise (because programmers make them believe so): Working with syntax files often is much easier. Consider the following example: To recode a variable that has 4 values into a new variable with only 2 values and to label the values of this new variable takes two (2) short lines of code in a syntax file (well, if the code is to be written very clearly, I would like to spread it over 5 lines, but 'logically', it's 2 lines). However, an SPSS for Windows Guide at New York University (no longer available to anybody, as far as I can see) gives thirteen (13) steps that are required for the same procedure using the menu system, and actually many of these steps consist of several smaller steps (such as: "enter value x here, value y there, and click the OK button").
Admittedly, a few things also work more easily via menus; most notably, matching files when you want to drop many variables. I may mention that these are things that used to be very simple in former versions of SPSS and have been made – unnecessarily, in my opinion – more complicated when SPSS entered the MS-WINDOWS age.
3. A decisive reason concerns the style of scientific work. Usually, data analysis is a task that develops over time, taking you many steps forwards, backwards, sideways, etc. In order for your work not to become publishable in the Journal of Irreproducible Results only, you have to be able to demonstrate, to yourself and to your fellow researchers, what you have done. How will you be able later to prove that you correctly recoded a variable (i.e., that you did it the way you think you did it) when all you can say is 'I clicked here and there, and I believe I did it the right way', but you have no hard copy proof to back up your memory?
4. Many people are involved in doing the same or similar analyses repeatedly (for instance, analyzing budget data every year, or quarterly, or monthly). In such circumstances, syntax files save you even more time. (Thanks to Hillel Vardi for pointing this out to me.)
5. A perhaps minor point, but still of some importance, is that the GUI (= graphical user interface) of SPSS for Windows seems to change from time to time. In addition, there are many other platforms on which SPSS runs. In contrast, the command syntax has remained surprisingly constant for decades. That is, you can very easily adapt to newer versions if you are accustomed to working via syntax files (even though this may make you reluctant to explore all the fabulous features that are added in each new version) and you can change between platforms very quickly. I learned to work with SPSS in 1975 via punchcards (young people, don't despair if you don't know what this means; there is something in history we may call progress, at least occasionally), but I can still use most of what I learned then.
6. A related point is that some of the more advanced statistical packages, such as S-Plus™, Stata™, SAS™, LIMDEP™, GAUSS™, or TDA, rely heavily on command language that is very similar to that of SPSS (well, TDA's is a little bit awkward, but still . . .). While it may be argued that most users never will have to deal with such packages, I would like to argue against a divide between 'simple' data analysis tasks that may be done by 'anybody' and require no sophistication and 'more complex' tasks that can be managed only by advanced users. While in practice such a divide does exist, this is a deplorable state that should be changed and should not serve as a justification for superficial work.
7. Finally, even though perhaps not too important: The SPSS syntax language, which is based on the English language, is the same in every country. However, the menu system is adapted to country-specific languages. Therefore, in this guide I would have to explain that German users, e.g., should click on 'Datei' where English users have to click on 'File', and so on.
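
To illustrate the recoding example from point 2 above, here is a minimal sketch of what such syntax might look like. The variable names `educ` and `educ2`, the value groupings and the labels are made up for this illustration and do not come from any particular data set.

```spss
* Hypothetical example: educ has the values 1 to 4, educ2 is the new variable.
RECODE educ (1, 2 = 1) (3, 4 = 2) INTO educ2.
VALUE LABELS educ2 1 'low' 2 'high'.
```

Written more clearly, the same two commands can be spread over five lines, with every continuation line indented:

```spss
RECODE educ
  (1, 2 = 1)
  (3, 4 = 2)
  INTO educ2.
VALUE LABELS educ2 1 'low' 2 'high'.
```

Note that RECODE is a transformation, so it is carried out only when the next procedure (or an EXECUTE command) is run.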

One thing that is indeed impossible via syntax files is setting up a data set and entering data. However, users often work with existing data sets anyway. Those who have to enter their own data are asked to consult any book on SPSS that covers that topic (virtually all do). Some of the sources listed below will be of help as well.

Finally, a further note on different versions of SPSS. I started this guide when I was working with version 6 of SPSS. As far as I could check, all of the examples I provide should work with SPSS for Windows, versions 6 through 15. In the meantime, version 18 is available which means that SPSS has changed quite a lot. New procedures have been added, the output differs slightly in content in different versions of SPSS and it differs enormously in form, at least if contrasted to version 6 with which I started (in this case one may doubt, in my view, whether there has been progress). Some of the procedures that are described now may not have been available in earlier versions. As dealing with the peculiarities of SPSS or writing this Guide is not at the core of my work, I usually can give only rough indications about changes or about when a certain procedure or feature was introduced. Most people working with SPSS today most likely can use anything they find here, but occasionally I may have missed changes made to procedures with the consequence that this Guide is not quite up-to-date. Likewise, people who for whatever reason may have (or wish) to resort to an earlier version may encounter stuff here that will not work for them, as the feature described here is not yet contained in the version they are using. I am pretty confident that this happens only infrequently, but anyway I wish to apologize for any inconvenience you may encounter.

## History

November 2016

Re-design.

October 2016

Nothing has happened in the past years. But hopefully the old stuff is still helpful for some people.

May 2012

Slightly enlarged the entry about basic charts by adding an example for a scatterplot matrix.

May 2010

Slightly enlarged the entry about (nonparametric) survival analysis.

February 2010

Added an entry on the SHIFT VALUES command, introduced in version 17.0.

December 2009

Added an entry on non-parametric tests. Slightly enlarged the entry on Pearson's r and on the t-test.

September 2009

Added a small entry on defining the level of measurement of variables. Rearranged and augmented the sections on reading and saving data. Added a (small) entry on SPSS settings.

June 2009

Finally augmented the GET FILE entry to account for the possibility of having several data files open. Included an entry on MULTIPLE IMPUTATION.

June 2007

Changed and augmented a little bit the entry on multinomial logistic regression. Added a warning to the GET FILE entry, as from version 14.0 onwards SPSS offers new possibilities for handling data files which are not yet covered here.

January 2007

October 2006

Slightly augmented the entry about basic charts.

April 2006

Added an entry on computing the intraclass correlation coefficient (section data analysis – data reduction).

November 2005

Added a short entry about variable names in the starting section of this guide (at #6).

December 2004

Slightly enlarged the entry on VARIABLE DISPLAY (previously VARIABLE FORMAT).

September 2004

Made some amendments to keyword COMPUTE and added a section about the LAG operator which may be used together with COMPUTE (or with IF).

March 2003

Added an entry on estimating multilevel (random coefficient) models with procedure MIXED.

July 2002

Added items on the PLUM procedure, requesting logistic regression (and related) models for dependent variables with several ordered categories, and NOMREG, to estimate multinomial logistic regression models.

June 2002

Added a special item on bar charts in section "data analysis". This is special inasmuch as a couple of bar charts are displayed in order to make clearer what the output of different options may be.

May 2002

Having included the "Basics" section (see April 2002), I have noticed that during all the years this guide has been on the web I have violated a rule I constantly preach to my students, viz., the rule to indent all lines save the first one in commands extending over more than one line. This has been fixed now. Also, a couple of minor errors have been eliminated.

April 2002

Included a new section "Basics" that explains some features of the syntax as well as of dealing with data files and the program output. Also included were some new items in the "Handling Data Files" section: Get File, Save File, Rename Variables and Variable Format. Finally, the entry Cox Regression has been enlarged to include a short discussion of "stratified" analysis.

Before April 2002

Everything else.

Note that minor amendments (such as corrections of misspellings) are not listed here.