Sharma Dhruv B, Bondell Howard D, Zhang Hao Helen
J Comput Graph Stat. 2013 Apr 1;22(2):319-340. doi: 10.1080/15533174.2012.707849.
Statistical procedures for variable selection have become integral elements in any analysis. Successful procedures are characterized by high predictive accuracy, yielding interpretable models while retaining computational efficiency. Penalized methods that perform coefficient shrinkage have been shown to be successful in many cases. Models with correlated predictors are particularly challenging to tackle. We propose a penalization procedure that performs variable selection while clustering groups of predictors automatically. The oracle properties of this procedure, including consistency in group identification, are also studied. The proposed method compares favorably with existing selection approaches in both prediction accuracy and model discovery, while retaining its computational efficiency. Supplemental materials are available online.