Frequency Tables
A frequency table lists each value of a variable together with the number of times it occurs. Percentages are displayed as well. Histograms, ntiles, percentiles, and sample statistics may also be requested.
Simple example
FRE var17.
More complex example
FRE var17 var29 to var31 var217
/ HIST
/ FORMAT NOT
/ PERCENTILES = 2.5 10 25 75 90 97.5
/ STAT mean median stddev var.
Computing frequencies is often the first step after the data have been entered. First, you may wish to check the values of a variable for weird or outright impossible entries. Second, it is always good to have a rough idea of the distribution of a variable. Third, SPSS will, on request, display sample statistics in addition to (or instead of) the frequency table. However, if you are particularly interested in the shape of the distribution or in the presence of outlying values, graphical means should also be deployed. The histogram that can be requested with the keyword HISTOGRAM (abbreviated as HIST in the more complex example above) is not always the best graphical display for that purpose; at least for metric variables, stem-and-leaf displays and box-and-whisker plots should also be used (see Exploratory Data Analysis). A bar chart is available with the keyword BAR instead of the histogram. If a variable has a large number of values and the check for weird entries has been completed, it may be useful to suppress the frequency table with the keyword FORMAT NOTABLE (abbreviated as FORMAT NOT).
Ntiles and percentiles are also available. If the subcommand NTILES = n is included, the distribution of the variable(s) will be divided into n parts, each containing approximately N/n values, where N is the number of cases with valid values. With PERCENTILES = value1 value2 ... valueX you may request any percentiles you wish. As can be seen in the more complex example above, percentile values may have decimals.
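As a brief illustration of the two subcommands, the following sketch requests quartiles once via NTILES and once via PERCENTILES. The variable var17 is simply the one used in the examples above, and the frequency table is suppressed to keep the output short.

```spss
* Quartiles: NTILES = 4 yields the 25th, 50th and 75th percentiles.
FRE var17
 / FORMAT NOT
 / NTILES = 4.

* The same cut points requested explicitly.
FRE var17
 / FORMAT NOT
 / PERCENTILES = 25 50 75.
```

Both subcommands may also be combined in a single FRE command.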
The keyword STATISTICS (or STAT) can be used to request sample statistics. The following statistics are available:
Keyword | Display |
---|---|
All | All of the following statistics |
mean | Sample mean |
median | Sample median |
mode | Sample mode |
min | Sample minimum |
max | Sample maximum |
range | Sample range |
sum | Sum of all values |
stddev | Sample standard deviation |
var | Sample variance |
skew | Sample skewness |
kurt | Sample kurtosis |
semean | Standard error for estimation of population mean |
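If only the sample statistics are of interest, they can be requested instead of the frequency table. The following is a minimal sketch that again assumes the variable var17 from the examples above.

```spss
* Selected statistics only, with the frequency table suppressed.
FRE var17
 / FORMAT NOT
 / STAT mean median min max range.

* Every statistic listed in the table above.
FRE var17
 / FORMAT NOT
 / STAT all.
```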
Note that "sample variance" and "sample standard deviation" mean that the values are computed as estimates of the variance and the standard deviation of a population, i.e. with the divisor n-1. If your data do not come from a sample but rather represent the entire population, the variance needs to be multiplied by the factor (n-1)/n and the standard deviation by the square root of this factor.
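The relation between the sample and the population versions can be written out as follows. The notation is an assumption made for this illustration and is not part of the SPSS output: $x_i$ are the observed values, $\bar{x}$ is their mean, $s^2$ is the sample variance as displayed by SPSS, and $\sigma^2$ is the variance of the data regarded as a complete population.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 ,
\qquad
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n-1}{n}\, s^2 ,
\qquad
\sigma = \sqrt{\frac{n-1}{n}}\; s .
```

For example, with n = 10 the displayed variance is multiplied by 0.9, and the displayed standard deviation by the square root of 0.9, roughly 0.949.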
Display options
FRE var17/FORMAT DVALUE.
will change the order in which the values are displayed: at the top, you will find the highest value, with the other values following in descending order. You may likewise wish to order values according to the frequency with which they occur; use FORMAT AFREQ for ascending order and FORMAT DFREQ for descending order.
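The two frequency-based orderings might be requested as in the following sketch, which again uses var17 from the examples above.

```spss
* Most frequent value first.
FRE var17 / FORMAT DFREQ.

* Least frequent value first.
FRE var17 / FORMAT AFREQ.
```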
Other charts
Instead of, or in addition to, the HISTOGRAM (or HIST) subcommand, you may use BAR and/or PIE to produce the pertinent chart (but note that you cannot request both a bar chart and a histogram). You may also indicate a minimum and/or a maximum value to be displayed; in addition, with bar charts and pie charts you may request that the chart be scaled in percent instead of frequencies (which is the default). Thus,
FRE age/FORMAT NOT/BAR=MIN(18) MAX(60) PERC.
will display the bar chart for all people in the data set in the range of 18 to 60 years and will have percentages given on the y axis. Note that the minimum and maximum values indicated do not refer to the minimum and maximum value of the x axis; the x axis may have a wider range.
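The same options can be combined with the other chart types. The following sketch assumes the variable age from the example above and requests a histogram and a pie chart in two separate commands.

```spss
* Histogram restricted to ages 18 to 60, frequency table suppressed.
FRE age
 / FORMAT NOT
 / HIST = MIN(18) MAX(60).

* Pie chart for the same age range, based on percentages.
FRE age
 / FORMAT NOT
 / PIE = MIN(18) MAX(60) PERC.
```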
© W. Ludwig-Mayerhofer, IGSW | Last update: 15 May 2010