Guo Zifang, Lu Wenbin, Li Lexin
Department of Statistics, North Carolina State University, Raleigh, NC, 27695, USA
Stat Biosci. 2015 Oct 1;7(2):225-244. doi: 10.1007/s12561-014-9114-4. Epub 2014 Apr 30.
Despite substantial progress in variable selection methods in recent years, modeling and variable selection for high dimensional censored regression remain challenging. When the number of predictors far exceeds the number of observational units and the outcome is censored, existing solutions often become computationally difficult, or even infeasible, and their performance frequently deteriorates. In this article, we aim at simultaneous model estimation and variable selection for Cox proportional hazards models with high dimensional covariates. To that end, we propose a forward stage-wise shrinkage and addition approach. Our proposal extends a popular statistical learning technique, the boosting method. It inherits the flexible nature of boosting and is straightforward to extend to nonlinear Cox models. It also advances the classical boosting method by adding explicit variable selection and by substantially reducing the number of iterations needed for the algorithm to converge. Our intensive simulations show that the new method performs competitively in Cox models both when the predictors outnumber the observational units and when they do not. The new method is also illustrated with an analysis of two real microarray survival datasets.
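For orientation, the classical boosting baseline that the abstract says the proposal extends can be sketched as follows. This is a minimal illustration of generic componentwise gradient boosting on the Cox partial likelihood, not the authors' shrinkage and addition algorithm. The function names, the Breslow-style handling of ties, the step size, and the fixed number of steps are all assumptions made for the example.

```python
import numpy as np


def cox_gradient(eta, time, event):
    """Gradient of the Cox partial log-likelihood with respect to the
    linear predictor eta (Breslow-style risk sets, O(n^2) for clarity)."""
    # at_risk[i, j] is 1 when subject j is still at risk at subject i's time
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    exp_eta = np.exp(eta - eta.max())      # the shift cancels in the ratio below
    risk_sum = at_risk @ exp_eta           # sum of exp(eta) over each risk set
    # d l / d eta_k = event_k - exp(eta_k) * sum over events i with k in risk set i
    return event - exp_eta * (at_risk.T @ (event / risk_sum))


def componentwise_cox_boost(X, time, event, n_steps=200, step_size=0.1):
    """Classical componentwise boosting (hypothetical helper): at each step,
    update only the coefficient whose covariate best fits the gradient.
    Assumes no column of X is constant."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardized covariates
    beta = np.zeros(p)
    for _ in range(n_steps):
        u = cox_gradient(Xs @ beta, time, event)
        # least-squares slope of u on each standardized column
        coefs = Xs.T @ u / n
        j = np.argmax(np.abs(coefs))            # best-fitting single covariate
        beta[j] += step_size * coefs[j]         # small shrunken update
    return beta                                 # coefficients on the standardized scale
```

Covariates that are never picked keep a coefficient of exactly zero, so stopping early gives an implicit form of selection. The abstract's point is that the proposed method makes selection explicit and needs far fewer iterations than a baseline of this kind.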