Collapsing your data means combining several cases into single lines. This is much like creating statistics for groups of cases, but collapsing creates a new data set that contains these statistics and can be put to further use.

By default, collapse computes the mean of one or several variables. So, the simplest version of the command goes like this:

collapse income, by(occupation)

The new data set will contain one row for each occupation, and the variable "income" will hold the mean income within each occupation. Keep in mind that the collapsed data replace the data currently in memory. See help collapse to find out more about other statistics and options.
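
As a rough sketch of how statistics other than the mean can be requested, the following uses the auto example dataset that ships with Stata (not the income data discussed above). The new variable names mean_price, med_price and n are chosen purely for illustration.

```stata
* load the example data shipped with Stata
sysuse auto, clear

* keep a copy of the data, since collapse replaces the data in memory
preserve

* one row per value of foreign, holding the mean, the median and
* the number of non-missing cases of price
collapse (mean) mean_price=price (median) med_price=price (count) n=price, by(foreign)
list

* return to the original data
restore
```

Wrapping the command in preserve and restore is one way to look at the collapsed result without losing the original data; saving the collapsed data under a new file name would be another.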

Note that you do not have to collapse your data if you just want to add the mean of a variable (possibly for subgroups) to your current dataset. Rather, use the egen command described in the section about generate/replace.
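
A minimal sketch of the egen alternative, again assuming the auto example dataset shipped with Stata; the variable name mean_price is made up for this illustration.

```stata
* load the example data shipped with Stata
sysuse auto, clear

* add the group mean of price to every case, separately for each value of foreign
bysort foreign: egen mean_price = mean(price)

* all cases are still there; each one now carries its group's mean
list make foreign price mean_price in 1/5
```

Unlike collapse, this keeps one row per case, so the group mean can be compared directly with each individual value.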


Contract creates a new dataset consisting of all combinations of a number of variables plus a new variable that represents the frequency of each combination.

contract occupation gender

will create a dataset that contains all occupation-gender combinations that occur in your original data and the frequency with which each combination occurs (stored in a new variable named _freq unless you specify otherwise). Note that by default missing values are treated as values in their own right, but this, like a number of other features, can be changed with options. For further information see help contract.
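
The following sketch illustrates the default behaviour and a few of the options, assuming the auto example dataset shipped with Stata, in which rep78 has some missing values. The variable name count is chosen only for this example.

```stata
* load the example data shipped with Stata
sysuse auto, clear

* default: missing values of rep78 form combinations of their own;
* frequencies are stored in _freq
preserve
contract foreign rep78
list
restore

* nomiss drops cases with missing values, zero adds combinations
* that do not occur in the data (with frequency 0), and freq()
* names the frequency variable
preserve
contract foreign rep78, nomiss zero freq(count)
list
restore
```

As with collapse, contract replaces the data in memory, which is why each call is wrapped in preserve and restore here.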

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 16 Jun 2015