Multilevel Modeling

Prefatory note 1: The commands xtmixed, xtmelogit etc. that were used to estimate multilevel models in Stata up to version 12 were replaced by mixed, melogit and so on as of version 13. However, the older commands are still available (as of version 14). The older commands beginning with xt and their newer counterparts are very similar; if anything does not work, please refer to the Stata help system or the manual. For models with metric dependent variables, I will present both the xtmixed and the mixed commands; for other models (presented further below) I will use the new commands only.

Prefatory note 2: Multilevel models can also be estimated with the user-written command gllamm, and therefore I will present a few examples that refer to this procedure. In the examples that follow I always add the adapt option (for adaptive quadrature). This way, results will be very close to those of (xt)mixed.

Throughout, I will provide only the minimum of commands necessary to make things run. With a few exceptions, the numerous options available are not discussed here, at least for the time being. Note that with xtmixed or mixed, you can also use factor variables.

The data set I have in mind is a subsample from the NELS-88 study; it is used in the introductory book by Ita Kreft and Jan de Leeuw (Sage, 1998). MATH (scores obtained in mathematics) is the dependent variable; SCHID is the identifier for schools (level 2); HOMEWORK is the amount of homework in hours; and PUBLIC is a dummy variable for public school. Of course, more independent variables may be introduced. Other examples (particularly for categorical dependent variables) are completely made up, but still use (by way of fiction) the variables I have just described.

Models for a metric dependent variable

Basic model (estimation of variances only)

xtmixed MATH || SCHID:, variance

mixed MATH || SCHID:, variance

Up to and including Stata 11, xtmixed used REML (restricted maximum likelihood) estimation by default. In version 12, and in the mixed command, the default changed to standard ML estimation. Whatever the default, you may request standard ML with option mle and REML with option reml. The examples use the option variance, which asks Stata to report the variances on the first and second level instead of standard deviations. (In mixed, variances are already the default display, so the option is harmless but not required there.)
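To make the choice of estimator explicit rather than relying on the version-specific default, the two options can simply be appended to the basic model. This is only a minimal sketch; it assumes the NELS-88 subsample with MATH and SCHID is already loaded in memory.

```stata
* Assumption: the NELS-88 subsample (MATH, SCHID) is in memory.

* Standard maximum likelihood, requested explicitly
mixed MATH || SCHID:, mle variance

* Restricted maximum likelihood
mixed MATH || SCHID:, reml variance

* The same options work with the older command
xtmixed MATH || SCHID:, reml variance
```

With many schools the two estimators should give very similar variance estimates; differences tend to matter more when the number of level-2 units is small.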

In gllamm, it works like this:

gllamm MATH, i(SCHID) adapt

If you don't want to compute the percentage of variance on both levels by hand, you may use the xtreg procedure:

xtreg MATH, re i(SCHID)

However, the estimates differ slightly from those of xtmixed, because xtreg with option re uses a GLS estimator rather than (restricted) maximum likelihood.
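Another way to avoid the hand calculation is the postestimation command estat icc, which reports the intraclass correlation, i.e. the school-level variance divided by the sum of the school-level and the residual variance. The following is a sketch that assumes Stata 13 or later and the data already in memory.

```stata
* Assumption: the NELS-88 subsample (MATH, SCHID) is in memory.

* Fit the variance-components model first
mixed MATH || SCHID:, variance

* Intraclass correlation:
*   var(school level) / ( var(school level) + var(residual) )
estat icc
```

Because estat icc is based on the estimates of the model just fitted, its result is consistent with the variances shown in the mixed output.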

An individual level covariate, plus random intercept

xtmixed MATH HOMEWORK || SCHID:

mixed MATH HOMEWORK || SCHID:

gllamm MATH HOMEWORK, i(SCHID) adapt

Random intercept and random slope

xtmixed MATH HOMEWORK || SCHID: HOMEWORK, cov(unstruct)

mixed MATH HOMEWORK || SCHID: HOMEWORK, cov(unstruct)

With gllamm, things now become a bit more complicated. We have to create a constant, and with the eq command we have to define one equation for the constant (the random intercept) and one for each covariate that is to have a random slope. All in all, it looks like this:

gen cons=1
eq cons:cons
eq slope: HOMEWORK
gllamm MATH HOMEWORK, i(SCHID) nrf(2) eqs(cons slope) adapt

With, say, two individual-level variables with random slopes, you may define, e.g., slope1 and slope2. In addition, you will write nrf(3) eqs(cons slope1 slope2). Note that even such a relatively simple model will require a great deal of patience on your part with gllamm, particularly if option adapt is used.
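Written out, the case with two random slopes might look as follows. This is a sketch only: SES is a hypothetical second individual-level variable that has not been used in this guide so far, and the data are assumed to be in memory.

```stata
* Assumption: data in memory; SES is a hypothetical second level-1 covariate.

* Constant for the random intercept (capture: skip if it already exists)
capture gen cons = 1

* One equation per random effect
eq cons: cons
eq slope1: HOMEWORK
eq slope2: SES

* nrf(3): three random effects at the school level
gllamm MATH HOMEWORK SES, i(SCHID) nrf(3) eqs(cons slope1 slope2) adapt
```

The number given in nrf() must match the number of equations listed in eqs().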

Random intercept, random slope plus a higher-level covariate

xtmixed MATH HOMEWORK PUBLIC || SCHID: HOMEWORK, cov(unstruct)

mixed MATH HOMEWORK PUBLIC || SCHID: HOMEWORK, cov(unstruct)

gllamm MATH HOMEWORK PUBLIC, i(SCHID) nrf(2) eqs(cons slope) adapt

Random intercept, random slope, higher-level covariate plus cross-level interaction

xtmixed MATH HOMEWORK PUBLIC c.HOMEWORK#c.PUBLIC || SCHID: HOMEWORK, cov(unstruct)

mixed MATH HOMEWORK PUBLIC c.HOMEWORK#c.PUBLIC || SCHID: HOMEWORK, cov(unstruct)

The cross-level interaction is in the expression c.HOMEWORK#c.PUBLIC (i.e. a factor variable; since PUBLIC is a 0-1 coded dummy variable, we may treat it like a continuous variable). As gllamm cannot deal with factor variables, you will create the cross-level interaction as follows:

gen HOMEPUB = HOMEWORK * PUBLIC
gllamm MATH HOMEWORK PUBLIC HOMEPUB, i(SCHID) nrf(2) eqs(cons slope) adapt
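With mixed (or xtmixed), the factor-variable notation also offers a shorthand: the ## operator expands to both main effects plus their interaction, so the three terms need not be typed separately. The sketch below assumes the data are in memory; here PUBLIC is entered with the i. prefix, i.e. as a categorical variable, which for a 0-1 dummy describes the same model.

```stata
* Assumption: the NELS-88 subsample is in memory.

* c.HOMEWORK##i.PUBLIC = HOMEWORK + PUBLIC + HOMEWORK x PUBLIC
mixed MATH c.HOMEWORK##i.PUBLIC || SCHID: HOMEWORK, cov(unstruct)
```

Declaring PUBLIC as categorical mainly changes how the coefficients are labelled in the output.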

Models for categorical dependent variables (short overview)

Basically, the models look very similar to those explained above. But unsurprisingly, some options are absent and others are available. I will present only the commands for simple models with one dependent and one independent variable.

Binary variables

Here, several models are available, which differ in the link function and should yield substantively similar results:

melogit DROPOUT HOMEWORK || SCHID:

meprobit DROPOUT HOMEWORK || SCHID:

mecloglog DROPOUT HOMEWORK || SCHID:
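The specification of the random part carries over from the metric models. As a hedged sketch (DROPOUT is one of the made-up variables, and the data are assumed to be in memory), a random slope can be added in the same way as before, and option or requests odds ratios instead of logit coefficients:

```stata
* Assumption: data in memory; DROPOUT is a fictitious 0/1 variable.

* Random intercept only, fixed effects reported as odds ratios
melogit DROPOUT HOMEWORK || SCHID:, or

* Random intercept and random slope for HOMEWORK
melogit DROPOUT HOMEWORK || SCHID: HOMEWORK, covariance(unstructured) or
```

Models with random slopes for categorical outcomes rely on numerical integration and may take noticeably longer to converge than their linear counterparts.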

Ordinal variables

meologit GRADE HOMEWORK || SCHID:

meoprobit GRADE HOMEWORK || SCHID:

Count data

mepoisson AWOL HOMEWORK || SCHID:

menbreg AWOL HOMEWORK || SCHID:

Multinomial dependent variable

These models cannot be estimated with the me commands. However, the gsem command may be used; it is currently beyond the scope of this guide.