Evaluating Random Forests for Survival Analysis using Prediction Error Curves.

Author information

Mogensen Ulla B, Ishwaran Hemant, Gerds Thomas A

Affiliations

Department of Biostatistics, University of Copenhagen, Denmark.

Department of Epidemiology and Public Health, University of Miami, USA.

Publication information

J Stat Softw. 2012 Sep;50(11):1-23. doi: 10.18637/jss.v050.i11.

Abstract

Prediction error curves are increasingly used to assess and compare predictions in survival analysis. This article surveys the R package pec, which provides a set of functions for efficient computation of prediction error curves. The software implements inverse probability of censoring weights to deal with right-censored data and several variants of cross-validation to deal with the apparent error problem. In principle, all kinds of prediction models can be assessed, and the package readily supports most traditional regression modeling strategies, like Cox regression or additive hazard regression, as well as state-of-the-art machine learning methods such as random forests, a nonparametric method which provides promising alternatives to traditional strategies in low- and high-dimensional settings. We show how the functionality of pec can be extended to yet unsupported prediction models. As an example, we implement support for random forest prediction models based on the R packages randomSurvivalForest and party. Using data from the Copenhagen Stroke Study, we use pec to compare random forests to a Cox regression model derived from stepwise variable selection. Reproducible results on the user level are given for publicly available data from the German breast cancer study group.
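
To make the weighting idea in the abstract concrete, here is a minimal sketch of an inverse-probability-of-censoring-weighted (IPCW) Brier score at a single time horizon, written in plain Python. It is an illustration only and not the pec package's implementation or API: the function names `censoring_survival`, `step_value`, and `ipcw_brier` are hypothetical, the censoring distribution is assumed to be estimated by a marginal Kaplan-Meier estimator, and ties between event and censoring times are handled naively.

```python
def censoring_survival(times, events):
    """Kaplan-Meier estimate of G(t) = P(C > t), the censoring survival function.

    events[i] is 1 for an observed event and 0 for a censored observation,
    so a censored observation counts as an "event" for this estimator.
    Returns a list of (time, G just after that time) pairs.
    """
    steps = []
    g = 1.0
    for u in sorted(set(times)):
        at_risk = sum(1 for t in times if t >= u)
        censored = sum(1 for t, e in zip(times, events) if t == u and e == 0)
        g *= 1.0 - censored / at_risk
        steps.append((u, g))
    return steps


def step_value(steps, t, left_limit=False):
    """Evaluate the step function at t, or just before t if left_limit is True."""
    value = 1.0
    for u, g in steps:
        if u < t or (u == t and not left_limit):
            value = g
        else:
            break
    return value


def ipcw_brier(times, events, surv_pred, horizon):
    """IPCW Brier score at `horizon`.

    surv_pred[i] is the model's predicted probability that subject i
    survives beyond `horizon`.
    """
    steps = censoring_survival(times, events)
    g_horizon = step_value(steps, horizon)
    total = 0.0
    for t, e, s in zip(times, events, surv_pred):
        if t <= horizon and e == 1:
            # Event before the horizon: true status is 0, weight 1 / G(t-).
            total += (0.0 - s) ** 2 / step_value(steps, t, left_limit=True)
        elif t > horizon:
            # Still at risk at the horizon: true status is 1, weight 1 / G(horizon).
            total += (1.0 - s) ** 2 / g_horizon
        # Censored before the horizon: status unknown, weight 0.
    return total / len(times)
```

Evaluating `ipcw_brier` over a grid of horizons gives one point of a prediction error curve per horizon. Computing it on the same data used to fit the model yields the apparent error, which is why the abstract pairs the weighting with cross-validation.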



