Bursac Zoran, Gauss C Heath, Williams David Keith, Hosmer David W
Biostatistics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
Source Code Biol Med. 2008 Dec 16;3:17. doi: 10.1186/1751-0473-3-17.
The main problem in many model-building situations is to choose, from a large set of covariates, those that should be included in the "best" model. A decision to keep a variable in the model might be based on clinical or statistical significance. Several variable selection algorithms exist, but these methods are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates, in which the analyst makes a variable selection decision at each step of the modeling process.
In this paper we introduce an algorithm which automates that process. We conduct a simulation study to compare the performance of this algorithm with three well-documented variable selection procedures in SAS PROC LOGISTIC: FORWARD, BACKWARD, and STEPWISE.
We show that this approach is advantageous when the analyst is interested in risk factor modeling and not just prediction. In addition to significant covariates, this variable selection procedure can retain important confounding variables, potentially resulting in a slightly richer model. Application of the macro is further illustrated with the Hosmer and Lemeshow Worcester Heart Attack Study (WHAS) data.
An analyst who needs an algorithm to help guide the retention of significant covariates as well as confounding ones should consider this macro as an alternative tool.
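To make the idea of retaining a confounder concrete, the following is a minimal sketch of a change-in-estimate check, one common way to flag confounding. It is not the macro itself: the function names, the example coefficients, and the 15% threshold are illustrative assumptions and are not taken from the text above. The coefficients are assumed to come from two already-fitted logistic models, one with the candidate variable and one without it.

```python
from typing import Dict


def relative_change(beta_full: float, beta_reduced: float) -> float:
    """Absolute relative change in a coefficient when a candidate variable is dropped."""
    if beta_full == 0:
        raise ValueError("full-model coefficient is zero; relative change is undefined")
    return abs((beta_reduced - beta_full) / beta_full)


def is_confounder(
    full_coefs: Dict[str, float],
    reduced_coefs: Dict[str, float],
    threshold: float = 0.15,  # assumed cutoff, chosen for illustration only
) -> bool:
    """Flag the dropped variable as a confounder if removing it shifts any
    remaining coefficient by more than `threshold` (as a proportion)."""
    return any(
        relative_change(full_coefs[name], reduced_coefs[name]) > threshold
        for name in reduced_coefs
    )


# Hypothetical coefficients: the full model includes "bmi", the reduced models drop it.
full = {"age": 0.050, "smoking": 0.80, "bmi": 0.02}
reduced_shifted = {"age": 0.049, "smoking": 0.60}  # smoking changes by 25%
reduced_stable = {"age": 0.049, "smoking": 0.78}   # every change is under 3%

keep_bmi_shifted = is_confounder(full, reduced_shifted)
keep_bmi_stable = is_confounder(full, reduced_stable)
```

Under this sketch, a variable that is not statistically significant would still be kept whenever dropping it moves another coefficient past the threshold, which is how a purely significance-based procedure and a confounder-aware one can end up with different models.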