Selecting Cases

In many instances, analysis is required only for a defined subset of your data. You may select cases for analysis in two ways. The FILTER BY command temporarily selects the cases that fulfil a condition, which is usually (but need not be) coded in a yes/no mode (with yes = 1 and no = 0); the selection can be undone with the FILTER OFF command. The SELECT IF command deletes the cases that do not fulfil a given condition from the working file, and thus cannot be "undone". (However, rest assured that SELECT IF affects only your working file, not the original data set, as long as you do not save the working file under the original file name.)

Examples

COMPUTE filter_v = (var17 eq 1 and var23 lt 23).
FILTER BY filter_v.

SELECT IF (var17 eq 1 and var23 lt 23).


FILTER BY: Temporary selection

How the FILTER BY command works can easily be seen in the data window. The cases that do not meet the filtering condition are "barred", as it were, from the analysis; however, they may re-enter at any time. FILTERing BY a variable (which may have any name) means that all cases with the value 0 or with a missing value in that variable are excluded from the analyses that follow until the filter is revoked by the FILTER OFF command. All other cases remain in the analyses.

Only one FILTERing variable can be listed in the command; therefore, if the filtering condition combines several variables or values, you first have to compute a single filtering variable that represents the combined condition.
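
As a minimal sketch of the complete filtering cycle, the following lines compute a filtering variable, run an analysis on the filtered cases, switch the filter off and repeat the analysis on all cases. The variable names (filter_v, var17, var23, var5) and the choice of FREQUENCIES as the analysis are assumptions made for illustration only.

* Combine two conditions into one yes/no filtering variable (names are hypothetical).
COMPUTE filter_v = (var17 eq 1 and var23 lt 23).
* From here on, only cases with a nonzero, nonmissing value in filter_v are analysed.
FILTER BY filter_v.
FREQUENCIES VARIABLES = var5.
* Revoke the filter so that all cases re-enter the analyses.
FILTER OFF.
FREQUENCIES VARIABLES = var5.

The two frequency tables should differ in their number of cases, which is a quick way to check that the filter did what you intended.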

Note that FILTERing does not affect all procedures; for instance, when you SAVE your data file, all cases will be saved, not only those that pass the filter. Likewise, data transformations will be performed on all cases, whether or not they pass the filter.
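
The following sketch illustrates these two exceptions under the same assumed variable names as before; the new variable var23_double and the file name all_cases.sav are likewise hypothetical.

COMPUTE filter_v = (var17 eq 1 and var23 lt 23).
FILTER BY filter_v.
* The transformation is carried out for every case, including the filtered-out ones.
COMPUTE var23_double = var23 * 2.
* The saved file contains all cases, not only those passing the filter.
SAVE OUTFILE = 'all_cases.sav'.
FILTER OFF.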


SELECT IF: "Permanent" selection

To SELECT cases means that all other cases are deleted from the working file. This command is therefore useful for saving computation time; if you are running a number of analyses on only a part of your data, it will be reasonable to SELECT that part. However, you must keep in mind that if you save your data under the name of the original file, that file will be overwritten with the reduced data set. Therefore, you will usually take care either not to save the reduced data set at all or to save it under a different name.
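
A cautious way of keeping a reduced data set might look like the following sketch. The file name subset.sav is a hypothetical placeholder; the point is only that it differs from the name of the original data file.

* Delete all cases that do not meet the condition from the working file.
SELECT IF (var17 eq 1 and var23 lt 23).
* Save the reduced working file under a new name so that the original file stays intact.
SAVE OUTFILE = 'subset.sav'.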

In the SELECT IF command, any condition can be defined for selecting cases. That is, you may write any condition that is permitted in conditional transformations directly after the SELECT IF keyword. When FILTERing cases, by contrast, the condition has to be evaluated beforehand, in a separate transformation that yields the single filtering variable.

© W. Ludwig-Mayerhofer, IGSW | Last update: 24 Apr 1998