Cleveland Dot Plots

A Cleveland dot plot looks like a bivariate plot and is sometimes used as one (see the example at the end). In its typical usage, however, it is in my opinion better considered a univariate plot in which the data values, represented by dots, carry identifying labels. The data values need not be raw observations; they may also be summary statistics such as means or percentiles (see Jacoby 2006). A very different type of dot plot is shown in a later entry. In Stata terms: Cleveland dot plots are what you get with graph dot, whereas the other type of dot plot is created by dotplot.

The terminology concerning dot plots or dot charts is not firmly established, occasional claims to the contrary notwithstanding. Both the plots described in this entry and those created by dotplot, and sometimes others as well, can be found under the headings of "dot plots" or "dot charts". However, the term "Cleveland dot plot" (after statistician and data visualisation expert William S. Cleveland [1983]) has been coined to highlight this particular type of chart, and I follow this usage. The Stata User's Guide does not use this term; it conceives of graph dot as a device for displaying summary statistics such as means or percentages.

Please note: Most graphs in this entry have been created by using, among other options, scheme(s1mono) and plotregion(lstyle(none)). These are not repeated in the examples shown below.
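As a minimal sketch of how these two options are attached to a command: the data set nlsw88 (shipped with Stata) and its variables wage and occupation are stand-ins chosen for illustration and are not the data used in this entry.

```stata
* Sketch only: nlsw88 and its variables are illustrative stand-ins
sysuse nlsw88, clear

* Both options are simply appended to the graph command
graph dot wage, over(occupation) scheme(s1mono) plotregion(lstyle(none))

* Alternatively, make s1mono the default scheme for the current session
set scheme s1mono
graph dot wage, over(occupation) plotregion(lstyle(none))
```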


Here is an example of the basic Cleveland dot plot, with single data values. I use data from the OECD on labour force participation of women aged 25 to 49 in 2007 (the data can be found in an introduction to statistics I wrote in German with two colleagues, Ludwig-Mayerhofer et al. 2014, on p. 51). In this example, the data are ordered from highest to lowest by means of the suboptions sort(1) descending within option over() (the country with a missing value ends up at the bottom), but they might also be ordered the other way round, or alphabetically.

graph dot lfp07_age2549, over(country, sort(1) descending label(labsize(*.8))) ytitle(" " "Female labour force participation") ylabel(55 (5) 80) exclude0

Cleveland dot plot

Option exclude0, together with ylabel(55 (5) 80), ensures that what Stata calls the y axis starts at 55. Calling this the y axis is entirely appropriate, as technically speaking what we see is a bivariate plot. And of course we may use this graph to think about which "country factors" might actually influence female labour force participation (we see that the Scandinavian countries are at the top, and the Southern European countries [and Japan] at the bottom). Still, I would say that basically this is a list of labelled data values (in this example, but not necessarily, ordered by size).
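The ordering alternatives and the effect of exclude0 can be tried out along the following lines. This is only a sketch: the data set nlsw88 (shipped with Stata) with its variables wage and occupation stands in for the OECD data, so the categories are occupations rather than countries.

```stata
* Sketch only: nlsw88 stands in for the OECD data used in this entry
sysuse nlsw88, clear

* Ascending instead of descending order: omit the descending suboption
graph dot wage, over(occupation, sort(1))

* No sort() suboption: categories appear in the order of the values of the
* over() variable (alphabetical if that variable is a string variable)
graph dot wage, over(occupation)

* Without exclude0, the value axis is scaled to include 0 ...
graph dot wage, over(occupation, sort(1) descending)

* ... with exclude0, the axis need not include 0
graph dot wage, over(occupation, sort(1) descending) exclude0
```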

Note that graph dot actually computes and displays the mean of the variable under investigation (labour force participation) for each country. But in this example, countries are cases; i.e., there is only a single row of data for each country, and therefore the "mean" is identical with that single value. Of course, the same graph might have been obtained from a data set with appropriately coded individual-level data (i.e., data with several hundred or several thousand cases from each country).
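The equivalence of the two routes can be illustrated with a short sketch. It assumes the data set nlsw88 (shipped with Stata) as a stand-in for individual-level data; the graph names individual and aggregated are arbitrary labels chosen here.

```stata
* Sketch only: nlsw88 stands in for individual-level data
sysuse nlsw88, clear

* Individual-level data: graph dot computes the mean wage per occupation
graph dot wage, over(occupation, sort(1) descending) name(individual, replace)

* Aggregate to one row per occupation, analogous to the country data set
collapse (mean) wage, by(occupation)

* One row per category: the "mean" is now just that single value
graph dot wage, over(occupation, sort(1) descending) name(aggregated, replace)
```

Apart from the axis title, which reflects the changed variable label, both graphs should show the same dots.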

graph dot can also be used to display several variables instead of several cases; here is an example with just two variables from a survey, the number of people in the respondents' households and the number of rooms of the apartments the respondents were living in:

graph dot hhsize nrooms, ascategory ylabel(,angle(0))

where option ascategory ensures that the two values are shown on two different lines.

Cleveland dot plot showing means of two variables
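To see what ascategory changes, one may compare the graph with and without it. The sketch below assumes the data set auto (shipped with Stata); its variables mpg and trunk are arbitrary stand-ins for the survey variables used above.

```stata
* Sketch only: auto and its variables are illustrative stand-ins
sysuse auto, clear

* Without ascategory: both means share one line, told apart by
* marker symbol and legend
graph dot mpg trunk

* With ascategory: each variable gets a line of its own
graph dot mpg trunk, ascategory ylabel(, angle(0))

* A statistic other than the mean is requested in parentheses
graph dot (median) mpg trunk, ascategory
```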

A further possibility is to display several values from the distribution of a variable. I combine this with two grouping variables, education and sex/gender, and request the median and the first and the ninth decile (i.e., the 10th and the 90th percentile) of equivalent income for each group. In other words, here we have a trivariate graph.

graph dot (p10) equivinc (p50) equivinc (p90) equivinc, over(edu) over(sex) legend(label(1 "p10") label(2 "Median") label(3 "p90"))

Cleveland dot plot showing medians and deciles over groups
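A comparable graph can be produced, and its values cross-checked, with a shipped data set. This is a sketch under assumptions: nlsw88 replaces the survey data, with wage, collgrad and married standing in for equivalent income, education and sex.

```stata
* Sketch only: nlsw88 and its variables are illustrative stand-ins
sysuse nlsw88, clear

* First decile, median and ninth decile of wage within two grouping variables
graph dot (p10) wage (p50) wage (p90) wage, over(collgrad) over(married) legend(label(1 "p10") label(2 "Median") label(3 "p90"))

* Cross-check for one subgroup: the same percentiles as a table
tabstat wage if married == 1, by(collgrad) statistics(p10 p50 p90)
```

The values reported by tabstat should correspond to the dots shown for the married respondents.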


References

  • Cleveland, William S. (1983): Visualizing Data, Hobart Press.
  • Jacoby, William G. (2006): The Dot Plot: A Graphical Display for Labeled Quantitative Values, The Political Methodologist. Newsletter of the Political Methodology Section, American Political Science Association, Vol. 14, Number 1, pp. 6-14.
  • Ludwig-Mayerhofer, Wolfgang/Liebeskind, Uta/Geißler, Ferdinand (2014): Statistik. Eine Einführung für Sozialwissenschaftler, Weinheim, Basel: Beltz Juventa.

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 21 Apr 2025