Testing for improvement in prediction model performance.

Author information

Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.

Publication information

Stat Med. 2013 Apr 30;32(9):1467-82. doi: 10.1002/sim.5727. Epub 2013 Jan 7.

Abstract

Authors have proposed new methodology in recent years for evaluating the improvement in prediction performance gained by adding a new predictor, Y, to a risk model containing a set of baseline predictors, X, for a binary outcome D. We prove theoretically that null hypotheses concerning no improvement in performance are equivalent to the simple null hypothesis that Y is not a risk factor when controlling for X, H0: P(D = 1 | X, Y) = P(D = 1 | X). Therefore, testing for improvement in prediction performance is redundant if Y has already been shown to be a risk factor. We also investigate properties of tests through simulation studies, focusing on the change in the area under the ROC curve (AUC). An unexpected finding is that standard testing procedures that do not adjust for variability in estimated regression coefficients are extremely conservative. This may explain why the AUC is widely considered insensitive to improvements in prediction performance and suggests that the problem of insensitivity has to do with use of invalid procedures for inference rather than with the measure itself. To avoid redundant testing and use of potentially problematic methods for inference, we recommend that hypothesis testing for no improvement be limited to evaluation of Y as a risk factor, for which methods are well developed and widely available. Analyses of measures of prediction performance should focus on estimation rather than on testing for no improvement in performance.
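
The recommendation above can be illustrated with a minimal sketch. It is not the authors' code: the simulated data, the effect sizes, and the helper names (`solve`, `fit_logistic`, `auc`) are all assumptions made for illustration. It tests Y as a risk factor with a likelihood ratio test, which assumes a logistic model for P(D = 1 | X, Y), and then reports the change in AUC as an estimate rather than testing it.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logistic(rows, d, iters=25):
    """Fit logistic regression by Newton-Raphson; return (coefficients, log-likelihood)."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, y in zip(rows, d):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for j in range(p):
                grad[j] += (y - mu) * x[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * x[j] * x[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for x, y in zip(rows, d):
        eta = sum(b * v for b, v in zip(beta, x))
        loglik += y * eta - math.log1p(math.exp(eta))
    return beta, loglik


def auc(scores, d):
    """Empirical AUC: P(case score > control score), with ties counted as 1/2."""
    cases = [s for s, y in zip(scores, d) if y == 1]
    controls = [s for s, y in zip(scores, d) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical simulated data (an assumption, not from the paper):
# X and Y are independent standard normal, and
# logit P(D = 1 | X, Y) = -0.5 + 1.0 * X + 1.0 * Y.
random.seed(0)
n = 500
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [random.gauss(0, 1) for _ in range(n)]
d = [int(random.random() < 1 / (1 + math.exp(-(-0.5 + x + y)))) for x, y in zip(xs, ys)]

base_rows = [[1.0, x] for x in xs]                  # intercept + X
full_rows = [[1.0, x, y] for x, y in zip(xs, ys)]   # intercept + X + Y
beta_base, ll_base = fit_logistic(base_rows, d)
beta_full, ll_full = fit_logistic(full_rows, d)

# Likelihood ratio test of "coefficient of Y is zero" (1 degree of freedom).
# For 1 df, the chi-square upper tail equals erfc(sqrt(stat / 2)).
lr_stat = max(2.0 * (ll_full - ll_base), 0.0)
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

# Estimate (rather than test) the change in AUC from adding Y.
auc_base = auc([sum(b * v for b, v in zip(beta_base, r)) for r in base_rows], d)
auc_full = auc([sum(b * v for b, v in zip(beta_full, r)) for r in full_rows], d)
delta_auc = auc_full - auc_base

print(f"LR statistic = {lr_stat:.2f}, p = {p_value:.3g}")
print(f"AUC: baseline = {auc_base:.3f}, with Y = {auc_full:.3f}, change = {delta_auc:.3f}")
```

The AUC values here are computed on the same data used to fit the models, so they are optimistic. In practice the estimated change would usually be reported with a confidence interval, for example from a bootstrap that refits both models in each resample.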
