Internet Guide to Stata
Crosstabulation is used to display the joint distribution of two variables. In addition, tests of significance and measures of association may be requested.
tab var17 var18
will display a crosstabulation with counts only.
tab var17 var18, col
will display column percentages in addition to counts.
tab var17 var18, row nofreq gamma
will display row percentages, but no counts. In addition, Goodman and Kruskal's gamma will be displayed together with its asymptotic standard error (ASE).
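To try the three commands above without data of your own, the following sketch uses the auto example dataset that ships with Stata. The variables foreign and rep78 are merely stand-ins for var17 and var18; they are an assumption made for illustration.

```stata
* Assumption: the auto dataset shipped with Stata stands in for your own data
sysuse auto, clear

* counts only
tabulate foreign rep78

* counts plus column percentages
tabulate foreign rep78, col

* row percentages without counts, plus Goodman and Kruskal's gamma and its ASE
tabulate foreign rep78, row nofreq gamma
```

tab is simply an abbreviation of tabulate, so either spelling may be used.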
Other tests of significance and measures of association can be requested through further options added after the comma.
Note that for estimation of Kendall's tau-b, there is also a special procedure, ktau, about which you can find more in the entry on correlations.
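As an illustration, the sketch below requests several statistics that tabulate accepts as options (chi2, lrchi2, V for Cramér's V, and taub) and then runs ktau on the same two variables. The auto dataset and the variables rep78 and foreign are assumptions chosen only so that the example can be run as it stands.

```stata
* Assumption: the auto dataset shipped with Stata stands in for your own data
sysuse auto, clear

* Pearson and likelihood-ratio chi-squared, Cramér's V, Kendall's tau-b
tabulate rep78 foreign, chi2 lrchi2 V taub

* Kendall's tau-a and tau-b with a significance test, without the table
ktau rep78 foreign
```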
Note also that further options control the display and/or output of the table.
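For instance, nolabel and missing are two display-related options that tabulate accepts. The sketch below is only an illustration on the auto example dataset (an assumption, as are the variable names).

```stata
* Assumption: the auto dataset shipped with Stata stands in for your own data
sysuse auto, clear

* show numeric codes instead of value labels
tabulate rep78 foreign, nolabel

* treat missing values as a category of their own
tabulate rep78 foreign, missing
```

With missing, the five cars without a repair record appear in an additional row instead of being dropped from the table.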
For higher-dimensional crosstabulations the by prefix may be used.
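A minimal sketch of the by prefix, assuming the nlsw88 example dataset that ships with Stata (the dataset and the variables race, married and collgrad are stand-ins for your own):

```stata
* Assumption: the nlsw88 dataset shipped with Stata stands in for your own data
sysuse nlsw88, clear

* one race-by-married table, with row percentages, for each value of collgrad
bysort collgrad: tabulate race married, row
```

bysort sorts the data by collgrad before repeating the command; with the plain by prefix the data must already be sorted.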
Alternatively, you may use the table command. It yields frequency counts (and summary statistics; see the entry on summarize) but no percentages. A three-dimensional table would look like this:
table education gender country
Tables with even more dimensions can be created using the by option, as in:
table education gender age, by(country)
Up to four variables may be included via by. Alternatively, the by prefix may be used. Finally, with the help of foreach (not covered in this guide) a table can be repeated for a number of conditions.
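The two table commands can be tried on the nlsw88 example dataset that ships with Stata, as sketched below. The dataset and variable names are assumptions for illustration. The version prefix is also an addition made here: table was redesigned in Stata 17, and the prefix is intended to ask such newer releases to interpret the command under the older syntax described in this guide. On older releases it can simply be left out.

```stata
* Assumption: the nlsw88 dataset shipped with Stata stands in for your own data
sysuse nlsw88, clear

* three-way table: race in rows, married in columns, collgrad as supercolumns
* (version prefix: request the older table syntax on newer releases)
version 11: table race married collgrad

* four-way table: union added as superrows via the by() option
version 11: table race married collgrad, by(union)
```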
Instead of tab we may use tab2. With this command, more than two variables can be specified.
tab2 up85 up8601 up8602 up8603, row col taub
will produce all possible two-way crosstabulations of the variables listed. Note the following useful option:
tab2 up85 up8601 up8602 up8603, firstonly row col taub
Here, all crosstabulations of up85 with the remaining variables will be displayed, with up85 as the row variable.
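The difference between the two calls can be seen in the following sketch, which assumes the nlsw88 example dataset shipped with Stata; the four variables are stand-ins for up85 to up8603.

```stata
* Assumption: the nlsw88 dataset shipped with Stata stands in for your own data
sysuse nlsw88, clear

* all pairwise tables among the four variables (six tables)
tab2 collgrad race married union, row col taub

* only the three tables that involve collgrad, the first variable listed
tab2 collgrad race married union, firstonly row col taub
```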
© W. Ludwig-Mayerhofer, Stata Guide | Last update: 23 Dec 2011