Labeling Variables and Values

Variable labels

label variable income "Gross income in 2008, in Euro"

This command, which may be abbreviated as la var, has to be repeated for each variable that is to be labeled.
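
As an illustration of the one-command-per-variable rule, here is a minimal sketch. The variables age and hhsize and all data values are hypothetical and are created only so that the example runs on its own.

```stata
* Hypothetical toy data, created only for this example
clear
set obs 3
generate income = 1000 * _n
generate age    = 30 + _n
generate hhsize = _n

* One label variable command per variable
label variable income "Gross income in 2008, in Euro"
label variable age    "Age in years"
label variable hhsize "Number of household members"

* The labels appear in the "variable label" column
describe
```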


Value labels

Basics

Labeling values is a two-step process: you first define one or more labels, and then attach them to one or more variables. Two command lines are therefore necessary:

label define mstatus 0 "unmarried" 1 "married"
label value status mstatus

Note that "status" is the name of the variable and "mstatus" the name of the label (the two names may be identical, by the way). The advantage of this two-step procedure is that several variables often share the same values with the same "meanings" (for instance, in the case of Likert-scaled items), and attaching the same label to all of them makes this explicit. To return to our example, there may be a list of up to 10 household members, with one variable per member indicating whether s/he is married or not. You still define a single label as in the example above and attach it to the variables, say, status1 to status10:

label value status1-status10 mstatus

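label define can be abbreviated as la de and label values as la val (label value, as written in the examples above, is itself an accepted abbreviation of label values).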
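
The following sketch runs the two steps from start to finish on a small made-up dataset. It assumes only three household members (status1 to status3) instead of ten, and the data values are invented for illustration.

```stata
* Hypothetical toy data: marital status of three household members
clear
set obs 4
generate status1 = mod(_n, 2)
generate status2 = 0
generate status3 = 1

* Step 1: define the label once
label define mstatus 0 "unmarried" 1 "married"

* Step 2: attach it to all variables in the range
label values status1-status3 mstatus

* Tables now show the labels instead of the raw values
tabulate status1
```

The range status1-status3 works here because the three variables are stored next to each other in the dataset; otherwise they would have to be listed one by one.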

Choosing value labels is not easy, as often only a small number of characters will be displayed. I advise my students to create labels that convey the significant information within the first 8 characters or so; otherwise, labels may become indistinguishable in the output of some procedures.

Information about value labels that exist in your dataset

label list mstatus

or

labelbook mstatus

will display a table showing the correspondence between values and labels for the value label "mstatus"; labelbook will present some additional information and will underline the first 12 characters of each label, which helps you to judge whether the labels will be unique in a typical piece of Stata output. Note that "mstatus" is the name of the label, not of the variable. If you don't remember the name of the label attached to a variable, you can find it with the help of the describe or the codebook command (just insert the variable name after the respective command). As of Stata version 12, value labels are also shown in the "Variables" section of the Properties window.
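
A short sketch of how the label name can be looked up when only the variable name is known. The toy data are hypothetical; besides describe, the example uses the extended macro function value label, which is not mentioned above but offers a way to retrieve the name for use in further commands.

```stata
* Hypothetical toy data with a labeled variable
clear
set obs 2
generate status = _n - 1
label define mstatus 0 "unmarried" 1 "married"
label values status mstatus

* The "value label" column shows which label is attached
describe status

* Store the label name in a local macro and reuse it
local lbl : value label status
display "`lbl'"
label list `lbl'
```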

Modifying existing value labels

Existing labels can be modified with the help of options. The most important options are:

label define mstatus 2 "divorced" 3 "widowed", add

add can be used to label values that do not yet have a label attached.

label define mstatus 0 "cohabiting" 2 "divorced" 3 "widowed", modify

modify has to be used if existing labels are to be changed. It includes add; in other words, you can modify existing labels and at the same time add new ones.
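
The difference between the two options can be tried out with the sketch below. It assumes that the label mstatus has been defined as in the Basics section. The attempt to overwrite an existing label with add is expected to fail, which is why it is wrapped in capture noisily so that the example keeps running.

```stata
* Start from the label defined in the Basics section
clear
label define mstatus 0 "unmarried" 1 "married"

* add: supplies labels for values 2 and 3, which had none
label define mstatus 2 "divorced" 3 "widowed", add

* add alone should refuse to overwrite the existing label for 0;
* capture noisily shows the error message without stopping
capture noisily label define mstatus 0 "cohabiting", add
display "return code: " _rc

* modify: changes the existing label for 0
label define mstatus 0 "cohabiting", modify

label list mstatus
```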

Dropping value labels

A value label attached to a particular variable can be removed with the help of the dot. Look closely at the end of the following example – it terminates with a dot (and not a stain of dirt on your screen):

label values name-of-variable[s] .

I have deviated from the usual practice in this guide of using an example variable name to make it clear that you have to name the variable(s) from which the labels are to be removed and not the labels themselves.

On the other hand, you may remove a value label itself (that is, its definition) from the data set with the command

label drop name-of-label
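
To make the difference between detaching a label from a variable and dropping the label itself concrete, here is a minimal sketch on hypothetical toy data. Note that dropping is done with label drop followed directly by the name of the label.

```stata
* Hypothetical toy data with a labeled variable
clear
set obs 2
generate status = _n - 1
label define mstatus 0 "unmarried" 1 "married"
label values status mstatus

* Detach: the variable loses its label, but the definition stays
label values status .
label list mstatus

* Drop: the definition itself is removed from the dataset
label drop mstatus

* Listing the label now fails; capture stores the nonzero return code
capture label list mstatus
display "return code: " _rc
```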

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 02 Aug 2015