Internet Guide to Stata
All items in this section refer to options; in other words, they follow a graph command and have to be separated from that command by a comma. Some are even sub-options; these follow options (typically in parentheses) and are separated from the ('main') option by a further comma.
Most of what I describe here is for twoway graphs only. For instance, it should be clear the options that refer to the x and y axes are meaningful only if there are indeed two axes, as is the case in twoway graphs. I hope you can discern which options (and commands, like graph export) are meaningful for other graphs as well.
Many aspects of how graphs look, such as colours, may be influenced by selection of a "scheme". The default scheme Stata uses is called s2color, and if you are content with the results, just leave things as they are. If you want to change specific aspects, for instance the number of labels on the y axis of a scattergram, you may use the pertaining option, and very often this is precisely what you will want to do.
Schemes, in contrast, change a large number of things at a single stroke. So it might be worth trying out a number of schemes to see whether they suit your needs (or your taste). For instance, in text printed black-and-white you may wish to avoid colours; so you might wish to try one of Stata's monochrome schemes such as s1mono or s2mono. Here's an example.
graph twoway scatter investments GDP, scheme(s1mono)
If you are sure that you wish to use a different scheme from the one provided by Stata as your default scheme, you may also set your scheme, either for your current session, as in
set scheme s1mono
or permanently, as in
set scheme s1mono, perm
More information about schemes can be obtained via help schemes; information on which schemes are available on your computer may be obtained by typing graph query, schemes
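To get a feeling for schemes, it may help to draw the same graph several times. The following is just a sketch; it assumes the auto example dataset that ships with Stata (variables mpg and weight), which is not used elsewhere in this guide.

```stata
* Load an example dataset shipped with Stata (an assumption for illustration)
sysuse auto, clear

* List the schemes installed on this computer
graph query, schemes

* Draw the same scatterplot with two monochrome schemes and compare
graph twoway scatter mpg weight, scheme(s1mono)
graph twoway scatter mpg weight, scheme(s2mono)
```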
Even if using a scheme may be an important step towards finalizing your graph, often you will still have to change some specific aspects of the display. So, whether you are using Stata's default scheme (by way of implication) or one of the other schemes, you should know what specific options are provided to influence graphs. In the following, a quite incomplete selection is offered.
Typically, a graph is wider than it is tall. In certain circumstances, you may wish to change this. This can be achieved with the aspectratio option. A ratio larger than 1 produces a graph that is taller than it is wide, as in
aspectratio(1.5)
In contrast, a ratio smaller than 1 results in a graph that is more wide than tall.
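A quick way to see the effect is to draw the same plot with two different ratios. This sketch again assumes Stata's bundled auto dataset; the values 1.5 and 0.5 are arbitrary.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Ratio above 1: taller than wide
scatter mpg weight, aspectratio(1.5)

* Ratio below 1: wider than tall
scatter mpg weight, aspectratio(0.5)
```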
A graph is the entire "display unit" in which one or several aspects of your data are shown. A graph can consist of one or more plots. Typically the plot(s), e.g., a scatterplot, is surrounded by a margin; this margin is the graph region. The plot itself is shown in the plot region, of course. Note that in fact there is an inner and an outer plot region; the same goes for the graph region. For instance, the inner plot region of the scatterplot is delimited by the x and y axes; the labels are shown at the limits of the outer plot region.
Stata graphs by default have a white (inner) plot region, whereas the graph region (and often the outer plot region) is in light blue. You may "annihilate" this difference by using option plotregion(style(none)). But all in all, graph and plot regions are a somewhat complex story. Note also that most schemes change the defaults for the graph region; as a consequence, it often is not necessary to deal with the graph and plot regions explicitly.
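If you do want to deal with the regions explicitly, something along the following lines may serve as a starting point. The data come from Stata's bundled auto dataset, and graphregion(color(white)) is an option not discussed above; I add it here only as one possible way of getting rid of the coloured margin.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Remove the distinct styling of the plot region
scatter mpg weight, plotregion(style(none))

* Alternatively, paint the graph region white
scatter mpg weight, graphregion(color(white))
```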
A title (to be displayed on top of the graph) can be requested with
title("Blood pressure in professors after ten years of service")
A title may extend over several lines; for each line, you will use a new set of quotation marks, as in:
title("Blood pressure in professors" "after ten years of service")
A subtitle may be given as well; just use subtitle in addition to title.
Finally, the axes may be titled as in
xtitle("Age at measurement")
ytitle("Systolic blood pressure RR")
If you don't give titles, the variable labels will be displayed (if they have been defined, of course). To suppress axis titles, use xtitle("") and/or ytitle("")
The size of titles is changed via the size sub-option. For instance,
xtitle("Time of Measurement", size(medlarge))
will produce a slightly larger font than the default.
help textsizestyle will introduce you to the manifold possibilities of manipulating the size of the title.
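Put together, the title options might be used as in the following sketch. It assumes Stata's bundled auto dataset, and the title texts are made up for the occasion. The /// at the end of a line continues the command on the next line; this works in do-files, not in the Command window.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Two-line title, subtitle, and slightly enlarged axis titles
scatter mpg weight, ///
    title("Mileage and weight" "of 1978 automobiles") ///
    subtitle("Stata's auto dataset") ///
    xtitle("Weight (lbs.)", size(medlarge)) ///
    ytitle("Mileage (mpg)", size(medlarge))
```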
To change Stata's default display of a legend, you can provide labels to be displayed in the legend as follows:
legend(label(1 "Men") label(2 "Women"))
You can also change the order in which labels are displayed:
legend(order(2 1) label(1 "Men") label(2 "Women"))
Finally, a legend you deem unnecessary may be suppressed with the following option:
legend(off)
A note will typically be displayed in small letters at the bottom left of the graph:
note("Özgul-Harnischfeger data 2004, own calculations")
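A legend only makes sense if there are at least two plots, so here is a sketch with two overlaid scatterplots. It assumes Stata's bundled auto dataset, where the variable foreign distinguishes domestic (0) from foreign (1) cars; the label and note texts are my own.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Plot 1 = domestic cars, plot 2 = foreign cars;
* the legend shows plot 2 first, and a note is added below the graph
twoway (scatter mpg weight if foreign == 0) ///
       (scatter mpg weight if foreign == 1), ///
       legend(order(2 1) label(1 "Domestic") label(2 "Foreign")) ///
       note("Source: auto dataset shipped with Stata")
```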
Prologue: Note that there is a difference between twoway graphs (the most common variety) and univariate (or oneway) graphs. Whereas the former have an x and a y axis, the latter have only a y axis, even though they may sometimes "look like" twoway graphs. For instance, a box-and-whisker plot showing the distribution of a variable for several groups is still considered a univariate graph; hence you cannot use options that refer to the x axis. The final subsection of this section gives some hints that refer to this case.
You can determine the range of the axes via "xsc" and "ysc". Note that you cannot restrict display of values to a smaller set of values than are present in the data; all you can do is to expand the axes beyond the smallest and / or largest values.
xsc(r(0 1)) ysc(r(0 50))
will set the minimum of both axes to 0, the maximum of the x axis to 1 and the maximum of the y axis to 50. If, say, the minimum value of y is 0, you may omit this value in "ysc", mentioning only the upper value of 50 within the parenthesis.
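The following sketch illustrates both points: expanding the axes works, restricting them does not. It assumes Stata's bundled auto dataset, in which weight runs from roughly 1,800 to 4,800 and mpg from 12 to 41.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Expand both axes so that they start at 0
scatter mpg weight, xsc(r(0 5000)) ysc(r(0 50))

* Only make sure that the y axis reaches 50
scatter mpg weight, ysc(r(50))

* A range narrower than the data does not cut off any points
scatter mpg weight, xsc(r(2000 3000))
```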
You can influence which values are displayed (and ticked) on each axis. For instance, if the x axis ranges from 0 to 10,000, you may wish to display values at 0, 2000, 4000 and so forth. The option to achieve this is:
xlabel(0(2000)10000)
The same rules apply to the ylabel option.
The values on the y axis by default are displayed vertically. If you wish to display them horizontally, you have to add a sub-option to the ylabel option. If this sub-option is used alone, the option will look like this:
ylabel(,angle(0))
In combination with a definition of the values to be labeled it would look like this:
ylabel(0(2000)10000, angle(0))
If the values of a variable are labeled, Stata graphs typically will display these labels as well. If you wish different labels to be used in the graph, they may be changed on the spot:
xlabel(1 "Low income" 2 "Intermediate income" 3 "High income")
Normally, labels are displayed side by side on the x axis. This may sometimes cause trouble if there are many labels or labels are long. You may use the alternate (or alt) sub-option, which will produce an offset between labels:
xlabel(1 "Low income" 2 "Intermediate income" 3 "High income", alt)
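To try these label options on real data, you might use something like the following sketch. It assumes Stata's bundled auto dataset; the text labels attached to the values of rep78 (repair record, coded 1 to 5) are invented here just to have some long labels.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Labels every 1,000 lbs on the x axis; horizontal labels every 10 mpg on the y axis
scatter mpg weight, xlabel(1000(1000)5000) ylabel(10(10)40, angle(0))

* Replace the numeric values by (hypothetical) text labels and stagger them
scatter mpg rep78, ///
    xlabel(1 "Poor" 2 "Fair" 3 "Average" 4 "Good" 5 "Excellent", alt)
```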
Often, the default display of a Stata graph includes grid lines, typically on the y axis. If the graph you wish to produce does not contain grid lines by default, this can be amended via the grid sub-option, as in:
ylabel(0(2000)10000, angle(0) grid)
Conversely, if you wish to suppress grid lines that are displayed by default, you may use the nogrid sub-option.
The size of the axis labels can be changed with the labsize sub-option within the xlabel or the ylabel option. So,
ylabel(0(2000)10000, angle(0) labsize(medlarge))
will render the labels somewhat larger.
The information available via help textsizestyle, which was introduced above, is also valid for the size of labels.
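Grid lines and label size can be combined with the other sub-options. A sketch, again assuming Stata's bundled auto dataset:

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Horizontal, slightly larger y labels with grid lines; grid lines on the x axis too
scatter mpg weight, ///
    ylabel(10(10)40, angle(0) grid labsize(medlarge)) ///
    xlabel(, grid)

* Suppress grid lines on the y axis
scatter mpg weight, ylabel(, nogrid)
```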
A typical case of a plot with only an apparent x axis is a boxplot showing the distributions of a number of groups. As is shown elsewhere, such a plot is produced using the over option, as in
graph box income, over(status)
Therefore, any options that refer to the labeling of the pseudo-x-axis have to be included as sub-options to the over option. For instance, to display the values (or labels) that denote the different groups in a larger font, you may include the label sub-option:
graph box income, over(status, label(labsize(medlarge)))
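If you have no income data at hand, the same idea can be tried with Stata's bundled auto dataset (an assumption made for this sketch), using foreign as the grouping variable:

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Boxplots of mpg for domestic and foreign cars, with larger group labels
graph box mpg, over(foreign, label(labsize(medlarge)))
```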
In a line chart, you may distinguish different lines by color or pattern. You may also wish to change the thickness of the line.
The pattern of the line can be changed via option lpattern, such as in
line sales1 sales2 year, lpattern(solid dash)
with the first line being drawn as a solid line and the second as a dashed line. Other pattern styles are dot, dash_dot, shortdash, shortdash_dot, longdash, longdash_dot or blank. The last option is just in case you wish to draw an invisible line for some reason or other. You may also create your own line pattern with the help of a "formula", such as in
line sales1 sales2 year, lpattern("_-." "__#")
You may combine elements _ (underscore = long dash), - (hyphen = medium dash), . (dot = short dash) and # (= small amount of blank space). "l" (small l) will draw a solid line.
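The following sketch shows both predefined patterns and a pattern "formula". It assumes the uslifeexp dataset that ships with Stata (life expectancy of men and women by year). The lwidth option, which changes the thickness of the lines, is not described above; I include it as one way to address thickness, with medthick being one of Stata's predefined widths.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse uslifeexp, clear

* First line solid, second line dashed
line le_male le_female year, lpattern(solid dash)

* Solid line via "l", a self-defined pattern for the second line,
* and somewhat thicker lines for both
line le_male le_female year, lpattern("l" "_-.") lwidth(medthick medthick)
```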
A horizontal line at a given value of y, say, 1.5, may be added with option
yline(1.5)
Several numbers may be enclosed within the parentheses, producing several lines.
A vertical line at a given value of x, say, 3, not surprisingly is added with option
xline(3)
Again, several values may be enclosed within the parentheses, producing several lines.
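Both options may be used together. In this sketch, which assumes Stata's bundled uslifeexp dataset, the reference values are picked arbitrarily:

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse uslifeexp, clear

* Horizontal lines at 50, 60 and 70 years of life expectancy,
* and a vertical line at the year 1918
line le year, yline(50 60 70) xline(1918)
```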
Often, you may add an option referring to colours. In the case of histograms and bar charts, this is called bcolor, with an example being bcolor(blue). Details are beyond the scope of this guide, at least for the time being.
If several graphs are created in one session, it may be helpful to give names to them (e.g., to refer to them later on). This is accomplished with the help of the "name" option.
kdensity bk_nk_w , name(NK_W, replace)
The replace option within the name option prevents Stata from stopping if you already created this graph and now want to create a modified version. As it will do no harm if the graph is not already present, you might as well use this option from the beginning, not having to think about it later on. (Of course, this holds true only if you're sure you don't want to keep the old graph!)
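Here is a sketch of how named graphs may be used within a session. It assumes Stata's bundled auto dataset, and the graph names are my own invention; graph dir and graph display are the commands for listing the graphs in memory and for showing one of them again.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Create two graphs and keep both in memory under different names
kdensity mpg, name(dens_mpg, replace)
kdensity weight, name(dens_weight, replace)

* List the graphs currently in memory
graph dir

* Bring the first graph to the front again
graph display dens_mpg
```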
Names are also needed to save graphs in a format that can be further processed by other software.
graph export NK_W.wmf, replace
will save the graph in Windows Metafile format (if you are working under Windows). Other formats are available; the best way to create a graph in a certain format is to give the file the appropriate extension.
Command | Format |
graph export NK_W.ps | PostScript |
graph export NK_W.eps | Encapsulated PostScript |
graph export NK_W.emf | Windows Enhanced Metafile |
graph export NK_W.pict | Macintosh PICT format |
graph export NK_W.pdf | PDF format (prior to version 12 available on Macintosh only) |
graph export NK_W.png | PNG (Portable Network Graphics) |
graph export NK_W.tif | TIFF format |
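If several graphs are in memory, the name option of graph export tells Stata which one to save. A sketch, assuming Stata's bundled auto dataset and file names of my own choosing; the files are written to the current working directory.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Create a named graph
scatter mpg weight, name(mpg_weight, replace)

* Export this graph in two formats; the file extension determines the format
graph export "mpg_weight.png", name(mpg_weight) replace
graph export "mpg_weight.eps", name(mpg_weight) replace
```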
© W. Ludwig-Mayerhofer, Stata Guide | Last update: 30 Jun 2014