Reordering or Re-arranging Data

Re-order variables within a data set

order id gender income

will place the three variables mentioned at the front of the data set (i.e. in the leftmost columns), in the order given. All other variables follow in their existing order.
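To see the effect, here is a minimal sketch using a made-up data set. The values and the extra variable age are assumptions for illustration only.

```stata
* Toy data set; the variables are deliberately entered in a "wrong" order
clear
input income age id gender
2500 34 1 1
3100 45 2 2
1800 29 3 1
end

* Move id, gender and income to the front, in that order
order id gender income

* The columns should now read: id gender income age
list
```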

move gender income

will move variable gender to the column currently occupied by income and shift the other variables, including income, to make room.
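As of Stata 11, move is no longer an official part of Stata, although it continues to work. In current versions, the same repositioning can be written with order and its before() option. The following is a sketch on made-up data; the values and the variable age are assumptions for illustration.

```stata
* Toy data set in which gender sits at the far right
clear
input id income age gender
1 2500 34 1
2 3100 45 2
3 1800 29 1
end

* Place gender directly in front of income
order gender, before(income)

* The columns should now read: id gender income age
list
```

There is also an after() option, which places the variables immediately behind the one named.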

Transpose data

Transposing a matrix means interchanging its rows and columns. That is, the values of any given row in the original data will end up in a column, and vice versa.

The appropriate Stata command is xpose. In its simplest form, it reads

xpose, clear

The option clear is required as a reminder that the resulting data set will replace the original data set in memory; in other words, the original data set will be lost unless it is saved prior to the transposition.

Option varname will add a variable called _varname that contains the variable names of the original data set. This will be helpful in case you want to transpose the data back to the original format.
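The round trip can be sketched as follows, again on a made-up data set (the values are assumptions for illustration). When a variable named _varname is present, xpose uses its contents to name the variables of the transposed data, which is what makes the second call restore the original names.

```stata
* Toy data set with three observations and three numeric variables
clear
input id income savings
1 2500 400
2 3100 150
3 1800 0
end

* Transpose; each original variable becomes one observation,
* and _varname records the original variable names
xpose, clear varname
list

* Transpose back; the names stored in _varname are reused
xpose, clear
list
```

Note that xpose is meant for numeric data; string variables turn into missing values when transposed.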

Change between long and wide format

Suppose your data consist of cases which contain sub-cases, as it were. Examples are families (cases) and their members (sub-cases), or countries (cases) and observations over several years for each country (sub-cases).

Such data can come in two formats: long, where all sub-cases are stacked beneath each other, and wide, where sub-cases are arranged side by side. The reshape command lets you switch between these formats.

To understand how it works, I will start with an example in long format. Suppose you have collected data about heterosexual couples; for each partner there are a number of variables, in our example only "income", "savings" and "debts" (for simplicity's sake). The partners are distinguished by variable "sex" with values 1 and 2, and there is a variable "coupleid" that distinguishes between couples.

reshape wide income-debts, i(coupleid) j(sex)

will re-arrange these data into wide format, with one observation per couple and two variables for income, income1 and income2, and likewise for savings and debts.
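A minimal sketch of the whole procedure, using two made-up couples (all values are assumptions for illustration). Here the three variables are listed individually instead of being given as a range, which does not depend on their position in the data set.

```stata
* Long format: one observation per partner, two observations per couple
clear
input coupleid sex income savings debts
1 1 2500 400 0
1 2 3100 150 1000
2 1 1800 0 500
2 2 2200 900 0
end

* Long to wide: one observation per couple, with variables
* income1 income2 savings1 savings2 debts1 debts2
reshape wide income savings debts, i(coupleid) j(sex)
list

* Wide to long: back to one observation per partner
reshape long income savings debts, i(coupleid) j(sex)
list
```

In reshape long, the names given are the stubs (income, savings, debts) without the numeric suffix; Stata re-creates the variable sex from the suffixes 1 and 2.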

Note: The range income-debts refers to all variables stored between "income" and "debts" in the data set. Be sure that variables "coupleid" and "sex" are not placed somewhere within that range, because in this case Stata will act as if they did not exist.

© W. Ludwig-Mayerhofer, Stata Guide | Last update: 01 Aug 2015