Binder Harald, Allignol Arthur, Schumacher Martin, Beyersmann Jan
Freiburg Center for Data Analysis and Modeling, University of Freiburg, Eckerstr 1 and Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany.
Bioinformatics. 2009 Apr 1;25(7):890-6. doi: 10.1093/bioinformatics/btp088. Epub 2009 Feb 25.
For analyzing high-dimensional time-to-event data with competing risks, tailored modeling techniques are required that consider the event of interest and the competing events at the same time, while also dealing with censoring. For low-dimensional settings, proportional hazards models for the subdistribution hazard have been proposed, but an adaptation for high-dimensional settings is missing. In addition, tools for judging the prediction performance of fitted models have to be provided.
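For orientation, the proportional subdistribution hazards model referred to above can be written as follows. The notation is an assumption introduced here and does not appear in the abstract: T is the event time, \varepsilon the event type (1 for the event of interest), x the covariate vector, and \beta the coefficient vector.

```latex
% Cumulative incidence function for the event of interest (type 1)
F_1(t \mid x) = P(T \le t,\ \varepsilon = 1 \mid x)

% Subdistribution hazard attached to F_1
\lambda_1(t \mid x) = -\frac{d}{dt} \log\bigl(1 - F_1(t \mid x)\bigr)

% Proportional subdistribution hazards assumption
\lambda_1(t \mid x) = \lambda_{1,0}(t)\, \exp\bigl(x^{\top} \beta\bigr)

% Resulting cumulative incidence, with \Lambda_{1,0}(t) = \int_0^t \lambda_{1,0}(s)\, ds
F_1(t \mid x) = 1 - \exp\bigl(-\Lambda_{1,0}(t)\, \exp(x^{\top} \beta)\bigr)
```

Under this formulation the covariates act directly on the cumulative incidence of the event of interest, which is why competing events have to be handled jointly rather than treated as ordinary censoring.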
We propose a boosting approach for fitting proportional subdistribution hazards models to high-dimensional data. It can, for example, incorporate a large number of microarray features while also taking clinical covariates into account. Prediction performance is evaluated using bootstrap .632+ estimates of prediction error curves, adapted for the competing risks setting. The approach is illustrated with bladder cancer microarray data, where simultaneous consideration of both the event of interest and the competing events allows for judging the additional predictive power gained from incorporating microarray measurements.
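To make the .632+ idea concrete, here is a minimal Python sketch of how a .632+ estimate combines three pointwise error values at a single time point, following the general .632+ rule of Efron and Tibshirani. It is not the peperr implementation. The function name and arguments (`err_632_plus`, `apparent`, `oob`, `no_info`) are hypothetical, and the three inputs are assumed to have been computed already (for example as prediction error values at one time point).

```python
def err_632_plus(apparent, oob, no_info):
    """Combine apparent, out-of-bag, and no-information error (.632+ rule).

    apparent: error of the model evaluated on its own training data
    oob:      average error on observations left out of bootstrap samples
    no_info:  error expected if predictors and outcome were unrelated
    All names are illustrative assumptions, not taken from any package.
    """
    # The out-of-bag error is not allowed to exceed the no-information error.
    oob_capped = min(oob, no_info)

    # Relative overfitting rate in [0, 1]; set to 0 when it is not meaningful.
    if oob_capped > apparent and no_info > apparent:
        r = (oob_capped - apparent) / (no_info - apparent)
    else:
        r = 0.0

    # Weight moves from 0.632 (no overfitting) to 1 (maximal overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * apparent + w * oob_capped


def curve_632_plus(apparent_curve, oob_curve, no_info_curve):
    """Apply the pointwise rule at every time point of a prediction error curve."""
    return [
        err_632_plus(a, b, n)
        for a, b, n in zip(apparent_curve, oob_curve, no_info_curve)
    ]
```

With no overfitting the weight is 0.632, which gives the plain .632 estimate. As the out-of-bag error approaches the no-information error, the weight approaches 1 and the estimate relies entirely on the out-of-bag error.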
The proposed boosting approach is implemented in the R package CoxBoost and prediction error estimation in the package peperr, both available from CRAN.