Internet Guide to Stata

Crosstabulation

Crosstabulation is used to display the joint distribution of two variables. In addition, tests of significance and measures of association may be requested.

Two variables

tab var17 var18

will display a crosstabulation with counts only.

tab var17 var18, col

will display column percentages in addition to counts.

tab var17 var18, row nofreq gamma

will display row percentages but no counts. In addition, Goodman and Kruskal's gamma will be displayed together with its asymptotic standard error (ASE).
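As a minimal sketch, the three commands above can be tried on the auto dataset that ships with Stata. The variables foreign and rep78 merely stand in for var17 and var18; they are an assumption of this example and not part of the guide's own data. Note that tab is an abbreviation of tabulate.

```stata
* Load the example dataset shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* Counts only
tabulate foreign rep78

* Counts plus column percentages
tabulate foreign rep78, col

* Row percentages without counts, plus Goodman and Kruskal's gamma and its ASE
tabulate foreign rep78, row nofreq gamma
```

Cases with a missing value on either variable are left out of the table unless the missing option is added.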

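Other options, to be added after the comma, include chi2 (Pearson's chi-squared test), lrchi2 (likelihood-ratio chi-squared test), exact (Fisher's exact test), V (Cramér's V) and taub (Kendall's tau-b).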

Note that for estimation of Kendall's tau-b, there is also a special procedure, ktau, about which you can find more in the entry on correlations.
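The following sketch shows how tests and measures of association might be requested, and how ktau can be called for the same pair of variables. It again assumes the auto dataset shipped with Stata, with foreign and rep78 as stand-in variables.

```stata
* Example data shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* Pearson and likelihood-ratio chi-squared, Cramér's V and Kendall's tau-b
tabulate foreign rep78, chi2 lrchi2 V taub

* Fisher's exact test (may take long for large tables)
tabulate foreign rep78, exact

* Kendall's tau-a and tau-b via the dedicated command
ktau foreign rep78
```

Unlike tabulate with the taub option, ktau displays no table, only the coefficients and a significance test.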

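Note also the options that affect the display of the table, such as missing (treat missing values as a category of their own), nolabel (show numeric codes instead of value labels) and wrap (do not break wide tables).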

Tables with more than two dimensions

For higher-dimensional crosstabulations the by prefix may be used.
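A minimal sketch of the by prefix, assuming the auto dataset shipped with Stata: one two-way table is produced for each level of the third variable. The indicator heavy is a hypothetical variable created only for this illustration.

```stata
* Example data shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* Hypothetical two-category variable: cars weighing more than 3,000 lbs
generate byte heavy = weight > 3000

* One rep78-by-heavy table per level of foreign;
* bysort sorts the data by foreign before running the prefix
bysort foreign: tabulate rep78 heavy, row
```

The by prefix requires sorted data, which is why bysort (or by foreign, sort:) is used here.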

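Alternatively, you may use the table command, but in the form shown here it yields only frequency counts (and summary statistics, see the entry on summarize), not percentages. The syntax described in this section is that of Stata 16 and earlier; table was redesigned in Stata 17. A three-dimensional table would look like this: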

table education gender country

Tables with even more dimensions can be created using the by option, as in:

table education gender age, by(country)

Up to four variables may be included via by. Alternatively, the by prefix may be used. Finally, with the help of foreach (not covered in this guide) a table can be repeated for a number of conditions.
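Although foreach is not covered in this guide, the idea of repeating a table for a number of conditions might be sketched as follows. The example assumes the auto dataset shipped with Stata; heavy is a hypothetical variable created for the illustration, and levelsof is used to collect the values to loop over.

```stata
* Example data shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* Hypothetical two-category variable: cars weighing more than 3,000 lbs
generate byte heavy = weight > 3000

* Store the distinct values of foreign in the local macro groups
levelsof foreign, local(groups)

* Repeat the same crosstabulation for each value of foreign
foreach g of local groups {
    display "Table for foreign == `g'"
    tabulate rep78 heavy if foreign == `g', row
}
```

The condition in the if qualifier may be replaced by any other expression, so the loop is not restricted to the levels of a single variable.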

Tables with two dimensions for more than two variables

Instead of tab we may use tab2. With this command, more than two variables can be specified.

tab2 up85 up8601 up8602 up8603, row col taub

will produce all possible crosstabulations between the variables mentioned. Note the following useful option:

tab2 up85 up8601 up8602 up8603, firstonly row col taub

Here, all crosstabulations of up85 with the remaining variables will be displayed, with up85 as the row variable.
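The difference between the two calls can be seen in a small sketch, again assuming the auto dataset shipped with Stata. The variables foreign, rep78 and the hypothetical indicator heavy stand in for up85 and the other variables above.

```stata
* Example data shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* Hypothetical two-category variable: cars weighing more than 3,000 lbs
generate byte heavy = weight > 3000

* All pairs: foreign by rep78, foreign by heavy, rep78 by heavy
tab2 foreign rep78 heavy, row col taub

* Only the tables involving the first variable:
* foreign by rep78 and foreign by heavy
tab2 foreign rep78 heavy, firstonly row col taub
```

With k variables, the first call produces k(k-1)/2 tables and the second k-1 tables.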

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 23 Dec 2011