Habbema J D, Gelpke G J
Comput Programs Biomed. 1981 Sep-Dec;13(3-4):251-70. doi: 10.1016/0010-468x(81)90103-3.
The computer program INDEP-SELECT has been developed to select an optimal subset from a set of possibly informative diagnostic or prognostic variables, but it is equally useful for other discriminant analysis or pattern recognition problems that involve variable selection. The approach is probabilistic: diagnostic probabilities are assigned to a patient on the basis of the values observed for the diagnostic variables. The statistical model is largely based on the assumption of independence between the variables, but one model parameter, the so-called 'global association factor', is added to take dependence into account. A stepwise forward selection strategy is used, in which each selection step adds one new variable to the set of variables already selected. The user may choose among several selection criteria, which determine the variable to be added at each step. All criteria are based on measures of diagnostic or prognostic performance. INDEP-SELECT can handle a large number of variables, including variables with missing data, and a large number of patients. The program is written in ANS Standard FORTRAN and takes relatively little computation time.
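The selection procedure described above can be illustrated with a minimal Python sketch. It is not the INDEP-SELECT code, which is written in FORTRAN, and several details are assumptions the abstract does not confirm: the global association factor (`gaf`) is taken to be an exponent applied to the product of per-variable likelihoods, variables are categorical, missing values (`None`) are skipped, frequencies are smoothed with an additive constant `alpha`, and the selection criterion is the mean log probability assigned to the true class, evaluated on the training data. All function and parameter names are hypothetical.

```python
import math
from collections import defaultdict


def fit_tables(rows, labels, n_vars, alpha=1.0):
    """Estimate class priors and smoothed per-class value frequencies.

    Missing values are encoded as None and ignored when counting.
    """
    classes = sorted(set(labels))
    values = [sorted({r[j] for r in rows if r[j] is not None})
              for j in range(n_vars)]
    counts = {c: [defaultdict(float) for _ in range(n_vars)] for c in classes}
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            if v is not None:
                counts[y][j][v] += 1
    tables = {}
    for c in classes:
        tables[c] = []
        for j in range(n_vars):
            total = sum(counts[c][j].values()) + alpha * len(values[j])
            tables[c].append({v: (counts[c][j][v] + alpha) / total
                              for v in values[j]})
    priors = {c: labels.count(c) / len(labels) for c in classes}
    return priors, tables


def posterior(row, selected, priors, tables, gaf=1.0):
    """Class probabilities under the independence model.

    ASSUMPTION: gaf scales the summed log-likelihoods, so gaf < 1 damps
    the evidence to compensate for dependence between variables.
    """
    log_post = {}
    for c, prior in priors.items():
        s = sum(math.log(tables[c][j][row[j]])
                for j in selected if row[j] is not None)
        log_post[c] = math.log(prior) + gaf * s
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {c: math.exp(v - m) / z for c, v in log_post.items()}


def log_score(rows, labels, selected, priors, tables, gaf):
    """Mean log probability assigned to the true class (higher is better)."""
    return sum(math.log(posterior(r, selected, priors, tables, gaf)[y])
               for r, y in zip(rows, labels)) / len(rows)


def forward_select(rows, labels, n_vars, gaf=1.0, min_gain=1e-6):
    """Add one variable per step; stop when no candidate improves the score."""
    priors, tables = fit_tables(rows, labels, n_vars)
    selected, remaining = [], set(range(n_vars))
    best = log_score(rows, labels, selected, priors, tables, gaf)
    while remaining:
        scores = {j: log_score(rows, labels, selected + [j],
                               priors, tables, gaf)
                  for j in sorted(remaining)}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best <= min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best


# Toy data (made up for illustration): variable 0 separates the classes,
# variable 1 carries no information, variable 2 is weakly informative
# and has missing values.
rows = [(1, 0, None), (1, 1, 1), (1, 0, 1), (1, 1, 0),
        (0, 0, 0), (0, 1, None), (0, 0, 0), (0, 1, 1)]
labels = ["D", "D", "D", "D", "H", "H", "H", "H"]

selected, score = forward_select(rows, labels, n_vars=3)
print("selected variables:", selected, "mean log score:", round(score, 4))
```

In this sketch the score is computed on the same patients used to estimate the frequencies, which is optimistic; a held-out or leave-one-out evaluation would be a natural refinement. Setting `gaf` below 1 pulls the assigned probabilities toward the priors, which is one way a single parameter could account for dependence among the selected variables.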