Stephen Reid, Rob Tibshirani
Department of Statistics, Stanford University, 390 Serra Mall, Stanford, CA, United States of America.
J Stat Softw. 2014 Jul;58(12).
We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso ($\ell_1$) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm, and it is shown that these offer a considerable speed-up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real-world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross-validation for this method, where natural unconditional prediction rules are hard to come by.
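To illustrate the style of algorithm the abstract refers to, the following is a minimal sketch of cyclic coordinate descent with soft-thresholding for an elastic net penalty. It uses a squared-error loss as a simplified stand-in for the penalized conditional logistic likelihood; the function names and parameterization here are illustrative assumptions, not the interface of the authors' package.

```python
import numpy as np

def soft_threshold(z, gamma):
    # Soft-thresholding operator applied in each coordinate update
    return np.sign(z) * max(abs(z) - gamma, 0.0)

def cd_elastic_net(X, y, lam, alpha=1.0, n_iter=200):
    """Cyclic coordinate descent for an elastic net penalty on
    squared-error loss (a simplified stand-in for the penalized
    conditional logistic likelihood). alpha=1 gives the lasso;
    alpha=0 gives ridge. Assumes roughly standardized columns."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with coordinate j removed
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r / n
            denom = X[:, j] @ X[:, j] / n + lam * (1.0 - alpha)
            beta[j] = soft_threshold(z, lam * alpha) / denom
    return beta
```

With `lam = 0` the updates converge to the ordinary least squares solution, while a larger `lam` with `alpha = 1` zeroes out weak coefficients, which is the mechanism behind the regularization paths and variable selection discussed above.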