Internet Guide to Stata

Recode variables: Command "recode"

If you wish to change the categories of a variable, you may employ the command recode. Normally, you will not want the changed variable to replace the original variable; rather, you will add the changed variable to the data set under a different name.

Example:

recode industry (1 2 = 1) (3 4 5 = 2) (6/8 = 3), gen(industry_3)

Here, 6/8 means "6 through 8"; the boundaries (i.e. 6 and 8) are included. The gen option that follows the comma causes Stata to store the result in the new variable "industry_3". If you are sure that you want to keep the original name of the variable with the changed values, you may omit the gen option; but the original values of the variable will be lost in this case.
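To see what the recoding does, it may help to try it on a small made-up data set and to cross-tabulate the old variable against the new one. The following is only a sketch: the eight observations are invented for illustration, and the tabulate step is merely one possible way to check the result.

```stata
* Invented toy data: one observation for each of eight industry codes
clear
input industry
1
2
3
4
5
6
7
8
end

* Recode into three groups; the original variable is kept
recode industry (1 2 = 1) (3 4 5 = 2) (6/8 = 3), gen(industry_3)

* Cross-tabulate old against new values to check the mapping
tabulate industry industry_3
```

Each row of the resulting table should have its single observation in the column of the group to which the code was assigned.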

You can label the new values in the process of recoding. It goes like this:

recode industry (1 2 = 1 "Primary sector") (3 4 5 = 2 "Secondary sector") ///
   (6/8 = 3 "Tertiary sector"), gen(industry_3)

The double quotation marks around the labels are necessary only if a label contains blanks; but I can't see any reason why you should not always use them.
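A quick way to check that the labels have arrived is to tabulate the new variable and to list the value labels in memory. Again, this is a sketch on invented data, not part of the recode command itself.

```stata
* Invented toy data: one observation for each of eight industry codes
clear
input industry
1
2
3
4
5
6
7
8
end

* Recode and label the new values in one step
recode industry (1 2 = 1 "Primary sector") (3 4 5 = 2 "Secondary sector") ///
   (6/8 = 3 "Tertiary sector"), gen(industry_3)

* The frequency table shows the labels instead of the bare numbers
tabulate industry_3

* List all value labels currently defined
label list
```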

Note that you can list several variables after recode if these are to be recoded in the same way. You may also list several variables in the parentheses following gen; of course, the number of new names must match the number of variables in the first list.
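Recoding several variables at once might look as follows. The variables q1, q2 and q3 and their values are hypothetical (think of three survey items on a five-point scale), as are the new names q1_r, q2_r and q3_r.

```stata
* Invented toy data: three items, each coded 1 to 5
clear
input q1 q2 q3
1 5 3
2 4 3
3 3 1
4 2 5
5 1 2
end

* Collapse all three items to three categories in the same way;
* gen() takes one new name per variable, in the same order
recode q1 q2 q3 (1 2 = 1) (3 = 2) (4 5 = 3), gen(q1_r q2_r q3_r)

* Show original and recoded values side by side
list q1 q1_r q2 q2_r q3 q3_r
```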

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 21 Aug 2012