Linear Regression and Some Alternatives
Simple example
regress DEPVAR INDVAR1 INDVAR2 INDVAR3, beta
The option beta is needed only if you want to obtain standardized regression coefficients.
Example with estimation of robust (Huber-White) standard errors
regress DEPVAR INDVAR1 INDVAR2 INDVAR3, beta robust
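For readers who want to try both variants, here is a minimal sketch. It assumes the auto example dataset that ships with Stata; the dataset and the variables price, mpg, weight and foreign are not part of the original examples and serve only as an illustration.

```stata
* Load the example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Standardized coefficients: a "Beta" column replaces the confidence intervals
regress price mpg weight foreign, beta

* Robust (Huber-White) standard errors; vce(robust) is equivalent to robust
regress price mpg weight foreign, vce(robust)
```

The coefficients are the same in both runs; only the standard errors (and hence the t statistics and p-values) change when robust standard errors are requested.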
More
Regression diagnostics and much else can be obtained after estimation of a regression model. Note that some statistics and plots will not work with survey data, i.e. if the svy prefix (see complex samples) was used. Here are some useful post-estimation commands:
estat hettest | Breusch-Pagan/Cook-Weisberg test for heteroskedasticity. |
estat vif | Variance inflation factors (VIF) and tolerance (1/VIF) for the independent variables. |
rvfplot | will display a plot of residuals vs. fitted values (helpful for assessing heteroskedasticity). |
avplots | will produce a tableau of added variable plots for all independent variables. |
avplot experience | will display an added variable plot for variable "experience". |
avplot 3.group | will display an added variable plot for the dummy variable that represents the category coded "3" of variable "group" (not the third value of this variable). |
cprplot experience | will produce a component plus residual plot for variable "experience". Options for this plot are available, such as "lowess" or "mspline". Note that an "augmented component plus residual plot" is available with command acprplot. It is said to do better in detecting non-linearity. |
predict cd1, cooksd | saves the values of Cook's d in variable "cd1". |
dfbeta | computes dfbeta for all independent variables and stores the values in variables whose names are given in the output. |
predict dfbe1, dfbeta(educ) | saves the values of dfbeta for variable "educ" in variable "dfbe1". |
estat ic | displays the values of AIC and BIC in the output. |
collin x1 x2 x3 | produces additional statistics about collinearity, e.g., eigenvalues, condition number and the determinant of the correlation matrix. Note that collin is an ado file which has to be downloaded (start with findit collin). |
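The built-in commands from the table can be strung together into a short diagnostic session. The following is only a sketch: it again assumes the auto example dataset shipped with Stata, and the 4/N threshold for Cook's d is a common rule of thumb rather than anything prescribed by this guide.

```stata
* Example data and model are assumptions for illustration
sysuse auto, clear
regress price mpg weight foreign

* Heteroskedasticity: formal test and residual-vs-fitted plot
estat hettest
rvfplot, yline(0)

* Collinearity: VIF and 1/VIF
estat vif

* Linearity and influence of single predictors
avplots
cprplot mpg, lowess

* Influential observations: Cook's d and dfbeta
predict cd1, cooksd
list make cd1 if cd1 > 4/e(N) & !missing(cd1)
dfbeta

* Information criteria
estat ic
```

The model is estimated without the robust option here because estat hettest tests for the very heteroskedasticity that robust standard errors are meant to accommodate.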
Alternatives to the regress command
Two or more dependent variables
You may estimate models where two or more dependent variables are regressed on the same set of predictors. The advantage over a series of regressions with a single dependent variable is that you may test effects across regression equations. I cannot go into details here and will leave you just with the basic command:
mvreg depvar1 depvar2 = ivar1 ivar2 ivar3
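To illustrate what testing across equations can look like, here is a small sketch. The auto example dataset and the chosen variables are assumptions for illustration only; equations are referred to by the name of their dependent variable.

```stata
* Example data are an assumption for illustration
sysuse auto, clear

* Three dependent variables regressed on the same predictors
mvreg headroom trunk turn = price mpg weight

* Joint test that weight has no effect in any of the three equations
test ([headroom]weight = 0) ([trunk]weight = 0) ([turn]weight = 0)
```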
You will not always want to use the same set of predictors, and in this case, a procedure called "seemingly unrelated regression" is the method of choice.
sureg (depvar1 ivar1 ivar2) (depvar2 ivar2 ivar3)
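A minimal sketch of a seemingly unrelated regression with different predictor sets follows. It assumes the auto example dataset shipped with Stata; the variable choice is purely illustrative.

```stata
* Example data are an assumption for illustration
sysuse auto, clear

* Two equations with partly different predictors; corr displays the
* residual correlation matrix and a Breusch-Pagan test of independence
sureg (price foreign weight length) (mpg foreign weight), corr

* Joint test that foreign has no effect in either equation
test ([price]foreign = 0) ([mpg]foreign = 0)
```

If the test of independence gives no indication that the residuals are correlated across equations, little is gained over estimating the equations separately.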
Ridge regression
Some people recommend "ridge regression", particularly if collinearity is high (many others do not recommend it!). If you want to give it a try, there is an ado file ridgereg which may be obtained via findit ridgereg.
© W. Ludwig-Mayerhofer, Stata Guide | Last update: 26 Feb 2018