Display and Output Format of Variables
How a variable appears when you look at the data with browse or edit is defined by the variable's "format". Moreover, the format can influence the output of statistical procedures. For instance, the means of a variable for the groups defined in an analysis of variance may appear to have no decimal places, which is very unlikely for real data. This is the result of a less than fortunate definition of the variable's format.
Formats for numeric variables
The format of a variable is displayed when you describe the variable. Alternatively, you may just write
format v17
to obtain the format of, e.g., v17. There are two basic formats for numeric variables, which will be described shortly; other, very special formats (exponential, binary, hexadecimal) are not explained here.
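The two ways of looking up a format mentioned above can be tried out as follows. This is only a minimal sketch; it assumes the example data set auto.dta that ships with Stata, and the variables price and mpg are merely stand-ins for a variable such as v17.

```stata
* Load an example data set (an assumption; any data set will do)
sysuse auto, clear

* describe lists the display format next to the storage type
describe price mpg

* format followed only by a variable name reports its current format
format price
```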
The general format
In this format, the overall width of a variable in the data window is indicated. The number of decimal places may or may not be indicated, but decimals will be displayed nevertheless if they are present in the data. Defining the number of decimal places means defining the maximum number of decimal places displayed, whereas not defining it will make Stata display as many decimal places as are present, within the limits of the overall width of the variable. This format is displayed as, e.g.,
%9.0g
which indicates that the display is 9 characters wide overall. The number 0 refers to the decimal places, but in this format it just means that all decimals are displayed, as long as the overall width permits their display. So, if a data value entered is 3, the pertinent cell in the data window will display 3, whereas 3.22 will be displayed as 3.22.
To change the format, just write, e.g.,
format v17 %6.0g
or, alternatively
format %6.0g v17
Very large or very small numbers may be displayed in the exponential format (e.g., 3.22e+06).
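The behaviour of the general format can be checked with a small artificial data set. The following is a sketch under assumptions: the variable name x and the three values are made up for illustration, the third one being chosen so that it cannot fit into 9 characters.

```stata
* Build a tiny data set with three illustrative values
clear
set obs 3
generate double x = 3 in 1
replace x = 3.22 in 2
replace x = 3220000000000 in 3

* General format, 9 characters wide: 3 and 3.22 appear as entered,
* the 13-digit value has to be shown in exponential notation
format x %9.0g
list x

* A narrower width changes only the display, not the stored values
format x %6.0g
list x
```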
The fixed format
This format differs from the general format inasmuch as the number of decimal places is fixed. If the format is defined with two decimal places, 3 will be displayed as 3.00, and a value of 3.004 likewise will be displayed as 3.00. Note particularly that if no decimal places are defined, some procedures, such as oneway, will display the group means without decimal places, i.e., rounded to the nearest integer.
This format is displayed as, e.g.,
%9.2f
for a display that accommodates 9 characters overall (including the decimal point!) and 2 decimal places.
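The two points made about the fixed format may be illustrated as follows. This is a sketch only: the variable x is made up, and the second part assumes the example data set auto.dta with mpg as outcome and foreign as grouping variable. Whether the group means lose their decimals is taken from the description above, so it is worth comparing the two oneway tables yourself.

```stata
* Part 1: the values 3 and 3.004 from the text
clear
set obs 2
generate double x = 3 in 1
replace x = 3.004 in 2
format x %9.2f
list x              // both observations are shown with two decimals
display x[2]        // the stored value itself is unchanged

* Part 2: format and oneway (example data are an assumption)
sysuse auto, clear
format mpg %9.0f
oneway mpg foreign, tabulate
format mpg %9.2f
oneway mpg foreign, tabulate
```

The format only governs what is shown; the data themselves are not rounded, so switching back to a format with decimal places restores the full display.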
Special tricks for the general and the fixed format
- Preceding a format with a "-" sign, as in %-9.2f, will cause the variable to be displayed with left-alignment.
- Using %9.2fc instead of %9.2f, or %9.0gc instead of %9.0g, will insert commas, as in 100,000 instead of 100000.
- Using %9,2f instead of %9.2f will display a decimal comma instead of a decimal point; as will %9,0g.
- Just entering the command set dp comma has the same effect; it will affect all values in the current data set that have decimal places. set dp period will switch back to the default display.
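The tricks listed above can be tried out one after the other. Again, this is just a sketch, assuming the example data set auto.dta; the variable price is chosen only because its values are large enough to show the thousands separator.

```stata
sysuse auto, clear

* Left-aligned display
format price %-9.2f
list price in 1/3

* Commas as thousands separators
format price %9.2fc
list price in 1/3

* Decimal comma for this variable only
format price %9,2f
list price in 1/3

* Decimal comma as a general setting, then back to the default
format price %9.2f
set dp comma
list price in 1/3
set dp period
```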
Formats for special types of variables
Stata offers possibilities to define string variables (variables containing characters) or date/time variables. These have their own formats and may be dealt with here later.
© W. Ludwig-Mayerhofer, Stata Guide | Last update: 26 Jul 2017