Crosstabulation

Crosstabulation is used to display the joint distribution of two variables. In addition, tests of significance and measures of association may be requested.

Simple example

CRO var17 by var203.

More complex example

CRO var17 by var29 by var16
  / CELL count col sresid
  / STAT btau.

In a crosstabulation, the values of one variable are displayed in the columns of the table and those of the other variable in the rows. Each cell, formed by the intersection of a column and a row, displays the number of cases that have both the value of the respective column and that of the respective row. (Additional cell content may be requested; see below.) The row variable is the one mentioned before the keyword BY, and the column variable is the one mentioned after it. If more than one BY keyword is used, there will be several layers; e.g., with one additional BY keyword, a separate crosstabulation of the first two variables is produced for each value of the third variable (the one mentioned last).
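
As a minimal sketch of this layout, the layered request can be written with the roles noted in a comment line. The variable names are just the placeholders used in the examples above, and their assignment to rows, columns and layers simply follows the order rule just described.

* var17 forms the rows, var29 the columns, var16 the layers.
CRO var17 by var29 by var16.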


Cell content

By default, SPSS will display only the absolute number of cases in each cell of the joint distribution. This default is overridden by the keyword CELL, together with additional keywords that indicate the cell contents to be displayed. Frequently, percentages are required to assess whether there is an association between the two variables. If one of the two variables can be considered the "independent" variable, this variable usually (i.e. if space permits) should be displayed in the columns of the table. In this case, column percentages will be used (keyword COL).

Note: As the subcommand /CELL overrides the default cell content (the absolute number of cases), you have to indicate "count" explicitly in the subcommand if the absolute number of cases in each cell is to be displayed.
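
To illustrate the note, here is a sketch with the placeholder variables from the simple example, assuming that var203 is the independent variable. The first command would show column percentages only; the second shows the number of cases as well.

* Column percentages only, no case counts.
CRO var17 by var203
  / CELL col.

* Case counts plus column percentages.
CRO var17 by var203
  / CELL count col.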

Cell content that may be displayed (together with the CELL keyword):

Keyword   Display
count     absolute number of cases
row       row percentages
col       column percentages
tot       total percentages
exp       expected values
res       residuals
sres      standardized residuals
asres     adjusted standardized residuals
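
Several of these keywords may be combined in one CELL subcommand. A sketch, again with placeholder variables, that requests the observed counts together with expected values and adjusted standardized residuals, so that observed and expected counts can be compared cell by cell:

CRO var17 by var203
  / CELL count exp asres.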

Statistics

The following statistics may be obtained. Note: If you are interested in the statistics alone, you may suppress the table by adding /format notable or just /format not.

Keyword   Display
chisq     Pearson Chi², Likelihood Ratio Chi²
phi       Phi statistic (for categorical variables, 2 x 2 tables) and Cramer's V (larger tables)
cc        Contingency coefficient (for categorical variables, larger tables)
lambda    Lambda statistic and Goodman and Kruskal's tau (for categorical variables)
btau      Kendall's tau-b (for ordinal variables)
ctau      Kendall's tau-c (for ordinal variables)
d         Somers' D (for ordinal variables)
gamma     Goodman and Kruskal's gamma (for ordinal variables)
corr      Pearson's r (for metric variables) and Spearman's rho (for ordinal data)
eta       Measure of association if the independent variable is categorical and the dependent variable is metric
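
For instance, to obtain only the Chi² tests and Phi / Cramer's V for two categorical variables, without the table itself, the request might look as follows. This is only a sketch: the variable names are placeholders, and the FORMAT subcommand is the one mentioned in the note above.

CRO var17 by var203
  / STAT chisq phi
  / FORMAT notable.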

Display: Newer versions of SPSS (version 8 or later, perhaps even version 7; I don't remember exactly) use so-called "pivot tables". These tables, to put it bluntly, are a mess. First, the labeling of the output is monstrous if you use percentages. Second, in some versions you cannot cut and paste (or save and retrieve) these tables into your word processor and then edit them as you like. If (!) you succeed in pasting them into a word processor file at all (some versions of SPSS are quite bad at this), they cannot be changed there, since they are treated not as text but as a graphical object. This second criticism does not apply to the more recent versions, in which tables copied to the clipboard may appear as tables in a word processor (I say "may" as I could not test all word processors; however, it works with Star Office and also with another very common, even though very annoying, word processor).

© W. Ludwig-Mayerhofer, IGSW | Last update: 05 Jan 2010