Multilevel Modeling
Prefatory note 1: The commands xtmixed, xtmelogit etc. that were used for estimation of multilevel models in Stata up to version 12 have been replaced by mixed, melogit and so on as of version 13. However, the older commands are still available for the time being (this currently includes version 14). The older commands beginning with xt and their newer counterparts are very similar; if anything does not work, please refer to the Stata help system or the manuals. For models with metric dependent variables, I will present both the xtmixed and the mixed commands; for other models (presented further below) I will use the new commands only.
Prefatory note 2: Multilevel models can also be estimated with gllamm, and therefore I will present a few examples that refer to this procedure. In the examples that follow I always add the adapt option (for adaptive quadrature). This way, results will be very close to those of (xt)mixed.
Throughout, I will provide only the minimum of commands necessary to make things run. With a few exceptions, the numerous options available are not discussed here, at least for the time being. Note that with xtmixed or mixed, you can also use factor variables.
The data set I have in mind is a subsample from the NELS-88 study; it is used in the introductory book by Ita Kreft and Jan de Leeuw (Sage, 1998). MATH (scores obtained in mathematics) is the dependent variable; SCHID is the identifier for schools (level 2); HOMEWORK is the amount of homework in hours; and PUBLIC is a dummy variable for public school. Of course, more independent variables may be introduced. Other examples (particularly those for categorical dependent variables) are completely made up, but still use (by way of fiction) the variables I have just described.
Models for a metric dependent variable
Basic model (estimation of variances only)
xtmixed MATH || SCHID:, variance
mixed MATH || SCHID:, variance
Up to and including Stata 11, xtmixed used REML (restricted maximum likelihood) estimation by default. In version 12, and in the mixed command, this has changed to standard ML estimation. Whatever the default, you may request standard ML with option mle and REML with option reml. The examples use the option variance, which requests Stata to report variances on the first and second level instead of standard deviations. Note that mixed already reports variances by default, so the option makes a difference only with xtmixed.
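As a brief illustration of the mle and reml options just mentioned, the following sketch overrides the default estimation method of each command. It uses only the variables introduced above.

```stata
* Request REML explicitly with the newer command (its default is ML)
mixed MATH || SCHID:, reml variance

* Request standard ML explicitly with the older command
xtmixed MATH || SCHID:, mle variance
```

With a reasonably large number of schools, the two methods should give similar variance estimates; differences tend to show up mainly when there are few level-2 units.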
In gllamm, it works like this:
gllamm MATH, i(SCHID) adapt
If you don't want to compute the percentage of variance on both levels by hand, you may use the xtreg procedure, which reports the share of variance due to the school level as rho:
xtreg MATH, re i(SCHID)
However, the estimates differ slightly from those of xtmixed, because xtreg with the re option uses a GLS estimator by default rather than maximum likelihood.
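Another way to avoid computing the percentage of variance by hand might be the estat icc postestimation command. This is a minimal sketch, assuming a Stata version that provides estat icc after mixed (version 13 or later should). The quantity reported is the intraclass correlation, i.e. the level-2 variance divided by the sum of the level-2 and level-1 variances.

```stata
* Fit the variance-components model, then request the intraclass correlation
mixed MATH || SCHID:, variance
estat icc
```

Because this is based on the same ML estimates as mixed, the result can differ slightly from the rho reported by xtreg.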
An individual level covariate, plus random intercept
xtmixed MATH HOMEWORK || SCHID:
mixed MATH HOMEWORK || SCHID:
gllamm MATH HOMEWORK, i(SCHID) adapt
Random intercept and random slope
xtmixed MATH HOMEWORK || SCHID: HOMEWORK, cov(unstruct)
mixed MATH HOMEWORK || SCHID: HOMEWORK, cov(unstruct)
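To make explicit what the commands above estimate, the model can be sketched in common multilevel notation. The symbols are illustrative shorthand of my own choosing and do not correspond to labels in the Stata output; i indexes pupils and j indexes schools.

```latex
\begin{aligned}
\mathrm{MATH}_{ij} &= \beta_0 + \beta_1\,\mathrm{HOMEWORK}_{ij}
                     + u_{0j} + u_{1j}\,\mathrm{HOMEWORK}_{ij} + e_{ij},\\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  &\sim N\!\left(
     \begin{pmatrix} 0 \\ 0 \end{pmatrix},
     \begin{pmatrix} \sigma^2_{u0} & \sigma_{u01} \\ \sigma_{u01} & \sigma^2_{u1} \end{pmatrix}
   \right),
\qquad e_{ij} \sim N(0, \sigma^2_e).
\end{aligned}
```

With cov(unstruct), the covariance \sigma_{u01} between random intercept and random slope is estimated along with the two variances; without this option, the default covariance structure (independent) fixes it at zero.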
With gllamm, things become a bit complicated now. We have to create a constant, and we have to define one equation for the constant and one for each covariate with a random slope. So, all in all it looks like this:
gen cons = 1
eq cons: cons
eq slope: HOMEWORK
gllamm MATH HOMEWORK, i(SCHID) nrf(2) eqs(cons slope) adapt
With, say, two individual-level variables with random slopes, you may define, e.g., slope1 and slope2. In addition, you will write nrf(3) eqs(cons slope1 slope2). Note that even such a relatively simple model will require a great deal of patience on your part with gllamm, particularly if option adapt is used.
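The case of two random slopes might be set up as in the following sketch. SES is a hypothetical second individual-level covariate that is not among the variables described above; it stands in for whatever variable you wish to add.

```stata
* SES is a hypothetical second individual-level covariate (an assumption for illustration)
* If cons and the equation cons were already defined above, skip the next two lines
gen cons = 1
eq cons: cons

* One equation per random slope
eq slope1: HOMEWORK
eq slope2: SES

* Three random effects: intercept plus two slopes
gllamm MATH HOMEWORK SES, i(SCHID) nrf(3) eqs(cons slope1 slope2) adapt
```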
Random intercept, random slope plus a higher-level covariate
xtmixed MATH HOMEWORK PUBLIC || SCHID: HOMEWORK, cov(unstruct)
mixed MATH HOMEWORK PUBLIC || SCHID: HOMEWORK, cov(unstruct)
gllamm MATH HOMEWORK PUBLIC, i(SCHID) nrf(2) eqs(cons slope) adapt
Random intercept, random slope, higher-level covariate plus cross-level interaction
xtmixed MATH HOMEWORK PUBLIC c.HOMEWORK#c.PUBLIC || SCHID: HOMEWORK, cov(unstruct)
mixed MATH HOMEWORK PUBLIC c.HOMEWORK#c.PUBLIC || SCHID: HOMEWORK, cov(unstruct)
The cross-level interaction is in the expression c.HOMEWORK#c.PUBLIC (i.e. factor-variable notation; since PUBLIC is a 0-1 coded dummy variable, we may treat it like a continuous variable). As gllamm cannot deal with factor variables, you have to create the cross-level interaction by hand:
gen HOMEPUB = HOMEWORK * PUBLIC
gllamm MATH HOMEWORK PUBLIC HOMEPUB, i(SCHID) nrf(2) eqs(cons slope) adapt
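With xtmixed or mixed, the fixed part can also be written more compactly. As a sketch: the ## operator of the factor-variable notation expands to both main effects plus their interaction, so the following should be equivalent to the mixed model with HOMEWORK, PUBLIC and c.HOMEWORK#c.PUBLIC spelled out.

```stata
* c.HOMEWORK##c.PUBLIC expands to HOMEWORK, PUBLIC and HOMEWORK x PUBLIC
mixed MATH c.HOMEWORK##c.PUBLIC || SCHID: HOMEWORK, cov(unstruct)
```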
Models for categorical dependent variables (short overview)
Basically, the models look very similar to those explained above. Unsurprisingly, though, some options are absent and others are available. I will present only the commands for simple models with one dependent and one independent variable.
Binary variables
Here, a couple of models are available which should yield similar results:
melogit DROPOUT HOMEWORK || SCHID:
meprobit DROPOUT HOMEWORK || SCHID:
mecloglog DROPOUT HOMEWORK || SCHID:
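For the logit model, the fixed effects may be easier to read as odds ratios. A minimal sketch, using the or option of melogit with the made-up DROPOUT variable from the example above:

```stata
* Report fixed effects as odds ratios instead of logit coefficients
melogit DROPOUT HOMEWORK || SCHID:, or
```

The option only changes how the results are displayed; the estimated model is the same.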
Ordinal variables
meologit GRADE HOMEWORK || SCHID:
meoprobit GRADE HOMEWORK || SCHID:
Count data
mepoisson AWOL HOMEWORK || SCHID:
menbreg AWOL HOMEWORK || SCHID:
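Similarly, for count models the coefficients can be displayed as incidence-rate ratios. A minimal sketch, using the irr option with the made-up AWOL variable from the example above:

```stata
* Report fixed effects as incidence-rate ratios
mepoisson AWOL HOMEWORK || SCHID:, irr
```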
Multinomial dependent variable
These models cannot be estimated with the me commands. However, the gsem command may be used for this purpose; it is currently beyond the scope of this guide.
© W. Ludwig-Mayerhofer, Stata Guide | Last update: 21 Mar 2019