Setting Time-to-Event Data and Doing First Analyses


For most analyses of time-to-event data, you first have to "stset" your data. This means telling Stata which variable holds the "duration" and which holds the "event" indicator, along with any other relevant details. Note that events are termed "failures" in Stata's manuals and in the help system. The "event" (or failure) variable distinguishes observations where an event occurred from censored observations (where no failure was observed).

In the case of single-event data (i.e. one record per subject), you will typically use one of the following commands:

stset duration, failure(event)

informs Stata that time to event is stored in variable "duration"; it is assumed that the variable "event" has value 0 or missing in the case of censored observations, with all other values indicating failures (events).
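As a minimal sketch of this basic form, the following do-file fragment builds a tiny made-up data set and declares it as survival-time data. The values and the variable name "id" are purely illustrative assumptions; only "duration" and "event" correspond to the names used above.

```stata
* Hypothetical toy data: one record per subject
clear
input id duration event
1  5 1
2  8 0
3 12 1
4  3 1
5 10 0
end

* Declare the data as survival-time data; event == 0 is treated as censored
stset duration, failure(event)

* Inspect the system variables that stset creates:
* _t0 (entry time), _t (exit time), _d (failure indicator), _st (record is used)
list id duration event _t0 _t _d _st
```

Listing the variables that stset creates is a convenient way to check that Stata has interpreted the duration and event variables as intended.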

stset duration, failure(event==2 3)

informs Stata that time to event is stored in variable "duration" and that the "events" (or failures) are denoted by values 2 and 3, with all other values indicating censored observations.
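To illustrate the effect of listing specific values, here is a small sketch with invented data in which "event" takes the values 0 to 3. The coding itself is an assumption made only for this example.

```stata
* Hypothetical toy data: event takes the values 0, 1, 2 and 3
clear
input id duration event
1  4 0
2  6 1
3  9 2
4 11 3
5  7 2
6  2 1
end

* Only the values 2 and 3 count as failures; 0 and 1 are treated as censored
stset duration, failure(event==2 3)

* Cross-tabulate the original codes against the failure indicator _d
tabulate event _d
```

The cross-tabulation should show the failure indicator set only for records with the values 2 and 3.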

If all observations end in failure (something that occurs very rarely), you may omit the failure option; Stata then treats every record as ending in a failure.
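A short sketch of this case, again with invented values:

```stata
* Hypothetical toy data without an event variable: every subject fails
clear
input id duration
1 4
2 7
3 9
end

* No failure() option: all records are assumed to end in a failure
stset duration

* The failure indicator _d should be the same for all records
tabulate _d
```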

Note that there are many more options, not least for repeated event data. These are not yet covered here.

Survivor or (cumulative) hazard functions

sts graph

will produce a graph of the survivor function, estimated via the Kaplan-Meier procedure. Note that the command itself names neither the time-to-event variable nor the variable marking censored observations; this information has already been supplied via the stset command described in the previous section.
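The sequence from stset to graph might look as follows. This sketch assumes the example data set "cancer" that is installed with Stata, with the variables "studytime" (duration) and "died" (event indicator); with your own data, substitute your own variable names.

```stata
* Load an example data set shipped with Stata (assumed to be available)
sysuse cancer, clear

* Declare the survival-time structure once
stset studytime, failure(died)

* Kaplan-Meier survivor function as a graph ...
sts graph

* ... and as a table of estimates
sts list
```

Because the survival-time declaration is stored with the data, all subsequent st commands can be issued without repeating the variable names.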

sts graph, failure

will graph the failure function 1-S(t), whereas

sts graph, hazard by(group)

will graph a (smoothed) hazard rate for each value of variable group.

A further graph option, cumhaz, will produce a graph of the (Nelson-Aalen) cumulative hazard. The by option, shown above together with hazard, can be combined with any of these graph types.
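The graph variants can be compared in a short sketch. It again assumes the "cancer" example data set installed with Stata, with its variable "drug" playing the role of "group".

```stata
* Load and declare the example data (assumed to be available)
sysuse cancer, clear
stset studytime, failure(died)

* Failure function 1-S(t), one curve per value of drug
sts graph, failure by(drug)

* Smoothed hazard rate, one curve per value of drug
sts graph, hazard by(drug)

* Cumulative hazard, one curve per value of drug
sts graph, cumhaz by(drug)
```

Each command replaces the previous graph in the graph window; add the name() option if you want to keep several graphs open at once.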

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 29 May 2012