T-test

The t distribution, developed by "Student" (a pseudonym of W. Gosset) more than 100 years ago, is used for a number of testing purposes. The procedure commonly called the t-test, however, is a test of the difference between two means (one of which may be a hypothetical value against which the mean of an observed variable is tested).

T-test for two independent samples (groups)

The t-test is often used to compare the means of two groups. This works as follows:

ttest income, by(married)

A few options can be appended. unequal (or une) tells Stata not to assume that the two groups have equal variances. welch (or w) requests Welch's approximation to the t-test, which has nearly the same effect as unequal; only the d.f. are different. Finally, level(99) (abbreviated as l(99)) requests, in this example, a confidence level of 99 per cent instead of the default 95 per cent for the calculation of confidence intervals.
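
As a minimal sketch, reusing the income and married variables from the example above, the options are simply added after the comma and can be combined:

ttest income, by(married) unequal
ttest income, by(married) welch level(99)

The first line drops the equal-variance assumption; the second uses Welch's degrees of freedom and reports 99 per cent confidence intervals.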

How do you know whether the two groups have the same variances? Use

sdtest income, by(married)

to obtain an F test (variance-ratio test) for equality of variances, or

robvar income, by(married)

which delivers a robust test proposed by Levene in 1960 and two alternatives by Brown & Forsythe in 1974. One of these alternatives uses the median instead of the mean in Levene's original formula and the other one the 10 per cent trimmed mean. These robust tests are more appropriate in the case of skewed variables.
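
To try these commands without data of your own, the following sketch uses the auto dataset that ships with Stata. The variables mpg and foreign are stand-ins chosen for illustration and do not appear in the examples above:

sysuse auto, clear
* compare the spread of mpg between domestic and foreign cars
sdtest mpg, by(foreign)
robvar mpg, by(foreign)
* if the variances appear to differ, drop the equal-variance assumption
ttest mpg, by(foreign) unequal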

T-test for paired means

Sometimes the two means to be compared come from the same group of observations, for instance, from measurements at points in time t1 and t2. Here, the appropriate version of the t-test is:

ttest incomet1 == incomet2

Note that Stata will also accept a single equal sign. The level(..) option described in the previous section is available as well.
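
The paired test is equivalent to a one-sample test of whether the mean of the within-observation differences is zero. A minimal sketch, where diff is a hypothetical variable name introduced only for this illustration:

generate diff = incomet1 - incomet2
ttest diff == 0

Both forms should report the same t statistic and degrees of freedom.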

T-test to compare one mean with a hypothetical value (one-sample t-test)

Here, the command goes like this:

ttest IQ = 110

Note that Stata will also accept a pair of equal signs. Again, the level(..) option is available.
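
Like other Stata commands, ttest leaves its results behind in r(), so they can be reused after the table has been displayed. A short sketch (type return list after ttest to see everything that is stored):

ttest IQ == 110
display "t = " r(t) "  d.f. = " r(df_t) "  two-sided p = " r(p)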

Immediate form of the t-test

Another interesting possibility is to do t-tests using information about group sizes, means, and standard deviations, instead of the raw data. This information may be entered immediately with the ttesti command, with the appended "i" signalling the "immediate" variety of the t-test.
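
A sketch of both varieties, with numbers invented purely for illustration. For a one-sample test the arguments are the number of observations, the mean, the standard deviation, and the hypothetical value; for a two-sample test they are the number of observations, mean, and standard deviation of the first group, followed by the same three figures for the second group:

ttesti 50 104.3 14.2 110
ttesti 30 2150 480 45 2400 610, unequal

The unequal, welch, and level(..) options described for ttest can be used with the two-sample form of ttesti as well.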

Finally, Stata offers the possibility of running Hotelling's generalized t-test (Hotelling's T-squared, a multivariate extension of the t-test) with the hotelling command. See help hotelling for more detail.