Goeman Jelle J
Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands.
Biom J. 2010 Feb;52(1):70-84. doi: 10.1002/bimj.200900028.
This article presents a novel algorithm that efficiently computes L1 penalized (lasso) estimates of parameters in high-dimensional models. The lasso simultaneously performs variable selection and shrinkage, which makes it very useful for finding interpretable prediction rules in high-dimensional data. The new algorithm combines gradient ascent optimization with the Newton-Raphson algorithm. It is described for a general likelihood function and can be applied in generalized linear models and other models with an L1 penalty. The algorithm is demonstrated in the Cox proportional hazards model, predicting survival of breast cancer patients using gene expression data, and its performance is compared with competing approaches. An R package, penalized, that implements the method is available on CRAN.
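To illustrate how an L1 penalty produces both shrinkage and variable selection, the following is a minimal sketch only. It is not the gradient ascent and Newton-Raphson combination presented in the article, and it is not the API of the penalized package. It swaps in a simpler, well-known technique (proximal gradient steps with soft-thresholding) and assumes a Gaussian linear model in place of the Cox model. The names soft_threshold and lasso_proximal_gradient, the step size, and the toy data are all hypothetical choices made for this illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; any z with |z| <= t becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_proximal_gradient(X, y, lam, step, n_iter=200):
    """Sketch: minimise 0.5 * ||y - X b||^2 + lam * ||b||_1.

    Assumption: `step` is at most 1 / (largest eigenvalue of X'X),
    which the caller must supply; this is not checked here.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals of the current fit.
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Gradient of the unpenalised Gaussian log-likelihood (ascent direction).
        grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by soft-thresholding, which handles the L1 term.
        beta = [soft_threshold(beta[j] + step * grad[j], step * lam) for j in range(p)]
    return beta


# Toy data with orthogonal columns (X'X = 2 * I), so step = 0.5 is valid.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2, 3.0, 0.2]
beta = lasso_proximal_gradient(X, y, lam=1.0, step=0.5)
print(beta)  # [2.5, 0.0]
```

On this toy design the unpenalized least-squares estimate is (3, 0.2). The penalized estimate shrinks the first coefficient to 2.5 and sets the second exactly to zero, which is the combined shrinkage and selection behaviour the abstract refers to.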