Wolfgang Ludwig-Mayerhofer's Introduction to TDA

Non-parametric analyses

The two non-parametric analyses are requested with the

ltb( );

and the

ple( );

command, respectively. Note that you will find these procedures in chapter 6.5 of the TDA manual under the heading "Describing Episode Data". I do not quite agree with this characterization, as these procedures also involve test statistics for differences between groups.

The basic structure of the command for the life table estimator is as follows:

ltb (
	tp = ...,
	grp = ...,
	) = name_of_output_file;

The tp command defines time periods, i.e. intervals into which the duration data are grouped. If all intervals are of the same width, a convenient way of defining the intervals is:

tp = starttime (increment) endtime,

with appropriate numbers replacing the arguments. In our example data, the longest duration is 47 months, and by giving the command

tp = 0 (1) 47,

there will be one row in the life table for each month. As several months do not show any transition, it may be meaningful to use an increment of two months, i.e. grouping two months into one interval with the command:

tp = 0 (2) 48,

You may also define intervals explicitly, such as

tp = 0, 1, 2, 4, 8, 10, 12, 16,

Both methods can be combined as, for instance, in the statement:

tp = 0 (2) 24, 30, 36, 48,

This is helpful when you want wider intervals in the right tail, for instance because only a few data points remain there.
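To make the interval notation concrete, here is a small Python sketch (not TDA code) that expands a tp-style specification into the list of interval boundaries as described above. The function name expand_tp is hypothetical, and the sketch only mirrors the description given here; TDA's own parser may treat edge cases differently.

```python
import re


def expand_tp(spec):
    """Expand a tp-style specification into a list of interval boundaries.

    Hypothetical helper: it handles explicit values ("0, 1, 2, 4") and the
    "start (increment) end" form, alone or combined, as described in the text.
    """
    # Split into numbers and parentheses; commas and blanks are separators.
    tokens = re.findall(r"\(|\)|[^\s,()]+", spec)
    bounds = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "(":
            # Pattern "start ( step ) end": start has already been appended.
            step = float(tokens[i + 1])
            end = float(tokens[i + 3])
            value = bounds[-1] + step
            while value <= end + 1e-9:
                bounds.append(value)
                value += step
            i += 4
        else:
            bounds.append(float(tokens[i]))
            i += 1
    return bounds


monthly = expand_tp("0 (1) 47,")
mixed = expand_tp("0 (2) 24, 30, 36, 48,")
print(len(monthly), mixed[-4:])
```

Each boundary starts one interval, so the mixed specification gives two-month intervals up to month 24 and wider ones after that.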

In comparing groups by means of non-parametric methods (life table or product-limit estimators), you do not use "covariates" in the strict sense, but rather indicators of group membership. Note that group membership must be defined explicitly: all cases with a non-zero value in a variable are taken to belong to one group, but the cases with value zero are not automatically treated as another group. If you want, for instance, to compare males and females, you have to define two variables from column 4 of the data set in the nvar ( ); section, as follows:

V11 (MALE) = c4[0],
V12 (FEMALE) = c4[1],

To indicate which variables are to be used in the comparisons, you will then use the grp subcommand, for instance, as follows:

grp = V11,V12,
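The logic of the two indicator variables can be sketched in Python (again, not TDA code). The sketch assumes that column 4 holds 0 for males and 1 for females, and that each indicator is non-zero exactly when the column has the stated value; the data and the helper name indicator are made up for illustration.

```python
def indicator(values, target):
    """Return 1 for cases whose value equals target, 0 otherwise."""
    return [1 if v == target else 0 for v in values]


# Hypothetical column 4 of a data set: 0 = male, 1 = female (assumed coding).
column4 = [0, 1, 1, 0, 1]

male = indicator(column4, 0)    # plays the role of V11 (MALE)
female = indicator(column4, 1)  # plays the role of V12 (FEMALE)

# A group consists of the cases with a non-zero value in its indicator.
groups = {"MALE": male, "FEMALE": female}
members = {
    name: [case for case, flag in enumerate(flags) if flag != 0]
    for name, flags in groups.items()
}
print(members)
```

With only the MALE indicator in the group list, the females would simply be left out of the comparison rather than forming a second group, which is why both variables are needed.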

Note, finally, that you have to indicate the name of a file to which the life table estimators are to be written. The standard TDA output file will only inform you about the variables you have built from the data file, the number of episodes, and the groups for your analyses. The reason for this perhaps awkward procedure is that it is easier to create plots from the life tables if these are in an appropriate file. (However, I will not treat this topic here.)
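For readers who want to see what a life table row contains, the following Python sketch computes a textbook actuarial life table from durations grouped into intervals. It is an illustration only: the column layout and the exact formulas used by TDA are not reproduced here, the adjustment of the risk set by half the censored cases is the common textbook convention, and the open-ended last interval is a choice made in this sketch. The function name and the data are hypothetical.

```python
def life_table(durations, events, bounds):
    """Actuarial life table for intervals [bounds[k], bounds[k+1]).

    durations: episode durations; events: 1 = transition, 0 = censored.
    An open-ended last interval is added so that no episode is dropped.
    """
    edges = list(bounds) + [float("inf")]
    rows = []
    survivor = 1.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        entering = sum(1 for t in durations if t >= lo)
        in_interval = [(t, e) for t, e in zip(durations, events) if lo <= t < hi]
        n_events = sum(1 for _, e in in_interval if e)
        n_censored = sum(1 for _, e in in_interval if not e)
        # Censored cases are counted as at risk for half of the interval.
        risk_set = entering - n_censored / 2.0
        q = n_events / risk_set if risk_set > 0 else 0.0
        rows.append({
            "start": lo,
            "end": hi,
            "entering": entering,
            "events": n_events,
            "censored": n_censored,
            "risk_set": risk_set,
            "survivor_at_start": survivor,
        })
        survivor *= 1.0 - q
    return rows


# Hypothetical durations in months with event indicators.
durations = [1, 2, 2, 3, 5, 6]
events = [1, 1, 0, 1, 0, 1]
table = life_table(durations, events, [0, 2, 4])
for row in table:
    print(row)
```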

The basic structure of the command for the Kaplan-Meier or product-limit estimator is as follows:

ple (
	grp = ...,
	csf,
	) = name_of_output_file;

In this procedure, exact duration times are used, so no time intervals need to be specified. The grp subcommand works in exactly the same way as in the life table procedure, and likewise you have to indicate an additional output file for the estimation results. The csf subcommand requests four common test statistics for the comparison of the groups defined in the grp subcommand.
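The product-limit idea itself is easy to sketch in Python (not TDA code). The function below walks through the distinct observed times, and at each one multiplies the survivor function by the share of the risk set that did not experience an event. The function name, the row layout, and the data are hypothetical; the test statistics requested with csf are not covered.

```python
def product_limit(durations, events):
    """Kaplan-Meier (product-limit) estimate of the survivor function.

    durations: exact episode durations; events: 1 = transition, 0 = censored.
    Returns one row (time, at_risk, n_events, n_censored, survivor) per
    distinct observed time.
    """
    rows = []
    survivor = 1.0
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e)
        n_censored = sum(1 for d, e in zip(durations, events) if d == t and not e)
        # Times with censored cases only leave the estimate unchanged.
        survivor *= 1.0 - n_events / at_risk
        rows.append((t, at_risk, n_events, n_censored, survivor))
    return rows


# Hypothetical durations in months with event indicators.
durations = [1, 2, 2, 3, 5, 6]
events = [1, 1, 0, 1, 0, 1]
estimate = product_limit(durations, events)
for row in estimate:
    print(row)
```

To compare groups, one would run the same computation separately on the cases of each group.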

Large datasets with exact measurements of durations and long observation time may result in a very large output of the Kaplan-Meier estimation, as each point in time with an event or a censored observation will yield one row in the output. In this case, it may be convenient to have a rough overview of the results. This can be obtained by requesting quantiles of the survivor function, say, deciles, with the qo subcommand, as shown in the following example:

ple (
	grp = ...,
	csf,
	qo=0.9(0.1)0.1,
	) = name_of_output_file;

For each group, TDA will show (in the standard output) the time at which 90, 80, 70 ... down to 10 percent of the respective group have "survived". Other starting and ending points and other quantiles are available, of course.
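How such quantiles can be read off an estimated survivor function is sketched below in Python (not TDA code). The sketch takes a step function as a list of (time, survivor) pairs and reports, for each proportion, the first time at which the survivor function has fallen to that proportion or below. This is one common convention and an assumption here; TDA may use a different rule, for example interpolation. The function name and the numbers are made up.

```python
def survivor_quantiles(curve, proportions):
    """Return, for each proportion p, the first time with survivor <= p.

    curve: list of (time, survivor) pairs sorted by time.
    A quantile that is never reached within the data is reported as None.
    """
    result = {}
    for p in proportions:
        # The small tolerance guards against floating-point noise.
        result[p] = next((t for t, s in curve if s <= p + 1e-12), None)
    return result


# Hypothetical survivor function for one group.
curve = [(1, 0.9), (2, 0.75), (4, 0.5), (7, 0.3)]

# Deciles from 0.9 down to 0.1, mirroring the idea of qo=0.9(0.1)0.1.
deciles = [k / 10 for k in range(9, 0, -1)]
quantiles = survivor_quantiles(curve, deciles)
for p in deciles:
    print(p, quantiles[p])
```

In this made-up example the lowest deciles are not reached, because 30 percent of the group are still "surviving" at the last observed time.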

Last update: 28 Jan 2000