Internet Guide to Stata
If you want to use an already existing Stata system file (with extension ".dta"), the appropriate command is
use name-of-data-file
If another data set is already in memory and you want to replace it with the new one (either because you have saved it or because you do not want to save it), write:
use name-of-data-file, clear
Up to now, I have assumed that the data are in your working directory, which normally is called "data" on a Windows PC. If the data set can be found somewhere else, you may write, for instance
use c:\mydirectory\mysubdirectory\name-of-data-file
where you have to fill in your directory and data set name. Another way is to change to the pertinent directory first and then to "use" the data file:
cd c:\mydirectory\mysubdirectory\
use name-of-data-file
Important note: If one or several of the directory names or the name of the data set contain blanks, the path and file name have to be enclosed in double quotation marks. Single quotation marks won't do the trick.
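As a brief illustration of the points above, here is a minimal sketch. The directory and file names are hypothetical and only serve to show where the double quotation marks go; pwd simply displays the current working directory.

```stata
* Hypothetical paths and file names, for illustration only
pwd                                     // display the current working directory
cd "c:\my directory\my subdirectory"    // quotation marks needed because of the blanks
use "name of data file", clear

* The same in one step, giving the full path
use "c:\my directory\my subdirectory\name of data file", clear
```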
If you know from the outset that you need only parts of a data set, you may request Stata to limit the data to be loaded. "Limiting" the data may refer to the variables used and/or to the selection of a subsample of cases. Look at the following examples:
use var1 var17 var38 using name-of-data-file
will load only the three variables mentioned into your working memory.
use if id <= 1000 using name-of-data-file
will load only cases with a value less than or equal to 1000 in variable id.
Both types of command may be combined, such as in
use var1 var17 if id <= 1000 using name-of-data-file
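If you are not sure which variables a file contains, you can inspect it before loading anything. The following sketch uses the same placeholder names as above; the restriction to the first 100 cases via "in" is an additional possibility not discussed in the text.

```stata
* Placeholder file and variable names, as in the examples above
describe using "name-of-data-file"      // list the variables without loading the data

* Load two variables for the first 100 cases only
use var1 var17 in 1/100 using "name-of-data-file", clear
```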
By default, Stata is not prepared to read data from other statistical packages. However, there are a number of workarounds. One of these is using software designed specifically to convert files from one format to another, such as Stat/Transfer or DBMS/Copy. But you can also do without such software.
If you have access to SPSS, you may save your file in Stata format and then use this version in Stata. With IBM SPSS version 19, however, it may happen that the file is simply not written. Later versions (IBM SPSS 22 at the time of this writing) seem to work well.
With Stata 12 or some earlier version, the easier way is to use the user-written routine usespss. If you have not yet installed it, type
findit usespss
and follow the directions given. Once usespss is installed, you can read your SPSS file with the following command:
usespss using spssfilename.sav, clear
Warning: At the time of this writing (May 2015), usespss is not available for the 64-bit version of Stata 13.
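Readers working with Stata 16 or later, releases that appeared after this guide was last updated, may not need usespss at all, because these releases include a built-in import spss command. A minimal sketch, assuming such a release and the same placeholder file name as above:

```stata
* Assumes Stata 16 or later; the file name is a placeholder
import spss using "spssfilename.sav", clear
```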
Stata has a command to read data that come in SAS XPORT Transport format. See help fdause (the name of the command is derived from the fact that the US FDA requires this format).
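In its simplest form, the command might be used as follows. The file name and the extension ".xpt" are placeholders; see help fdause for the options available in your version.

```stata
* Placeholder file name, for illustration only
fdause "name-of-data-file.xpt", clear
```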
Software like Microsoft Excel™, but also many other programs, such as SPSS, can create so-called ASCII files that can be read by almost any software. As of version 12, however, Stata can import Excel™ files directly.
import excel name-of-data-file, firstrow clear
will import the first sheet from file "name-of-data-file", assuming that the first row contains the variable names. If the data you wish to import are not in the first sheet, try adding the option sheet("name-of-sheet"). There are other options as well; e.g., you might restrict the import to some rows and columns.
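A sketch of how these options might be combined is given below. The file name, the sheet name and the cell range A1:D50 are hypothetical; the describe option lists the sheets of a workbook and their ranges without importing anything.

```stata
* Hypothetical file name, sheet name and cell range
import excel using "name-of-data-file.xlsx", describe    // list sheets and their ranges

* Import only cells A1 to D50 of one sheet, taking variable names from the first row
import excel using "name-of-data-file.xlsx", sheet("name-of-sheet") cellrange(A1:D50) firstrow clear
```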
Stata can also export to Excel™ files, but as yet I have had no reason to try this, so please find out for yourself how this works.
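As a starting point for your own experiments, a minimal sketch based on help export excel might look like this. The output file names are hypothetical, and the variable names are the placeholders used earlier in this guide.

```stata
* Hypothetical output file names; var1, var17 and id are placeholders
* Write all variables, with variable names in the first row
export excel using "name-of-output-file.xlsx", firstrow(variables) replace

* Write only two variables for a subsample of cases
export excel var1 var17 using "name-of-subset-file.xlsx" if id <= 1000, firstrow(variables) replace
```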
An ASCII file contains the data (one case per line), a delimiter to separate the data values, possibly the names of the variables in the very first line of the file, but no formatting or other stuff. Such files can easily be read by Stata.
insheet using name-of-data-file, c n clear
will read an ASCII file with comma separated values and names in the first line. Other options are:
t for tab delimited data
delim("X") for data delimited by X, where X stands for any character used as the delimiter.
Stata can also read ASCII files with no delimiters between columns. These may come in free format (here, columns have to be separated by spaces) or in fixed format (where for each column, i.e. variable, an exact position can be given). For more information, see help infile or help infix.
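The different ways of reading ASCII files described above can be sketched as follows. All file names, variable names and column positions are hypothetical. The import delimited command is not covered in this guide; it is available from Stata 13 on and is shown here only as an alternative to insheet.

```stata
* Hypothetical file names, variable names and column positions

* Delimited files with variable names in the first line
insheet using "name-of-data-file.csv", comma names clear            // comma separated
insheet using "name-of-data-file.txt", tab names clear              // tab delimited
insheet using "name-of-data-file.txt", delimiter(";") names clear   // semicolon delimited

* Alternative in Stata 13 or later
import delimited using "name-of-data-file.csv", delimiters(",") varnames(1) clear

* Free format: values separated by blanks, variables named in the command
infile id age income using "name-of-data-file.raw", clear

* Fixed format: each variable is read from the given column positions
infix id 1-4 age 5-6 str lastname 7-20 using "name-of-data-file.raw", clear
```

Each of these commands replaces the data in memory, so the lines are meant as alternatives, not as a sequence to be run in one go.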
© W. Ludwig-Mayerhofer, Stata Guide | Last update: 22 May 2015