
One statistical test is sufficient for assessing new predictive markers.

Affiliation

Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, Box 44, New York, NY 10065 USA.

Publication information

BMC Med Res Methodol. 2011 Jan 28;11:13. doi: 10.1186/1471-2288-11-13.

Abstract

BACKGROUND

We have observed that the area under the receiver operating characteristic curve (AUC) is increasingly being used to evaluate whether a novel predictor should be incorporated in a multivariable model to predict risk of disease. Frequently, investigators will approach the issue in two distinct stages: first, by testing whether the new predictor variable is significant in a multivariable regression model; second, by testing differences between the AUC of models with and without the predictor using the same data from which the predictive models were derived. These two steps often lead to discordant conclusions.

DISCUSSION

We conducted a simulation study in which two predictors, X and X*, were generated as standard normal variables with varying levels of predictive strength, represented by means that differed depending on the binary outcome Y. The data sets were analyzed using logistic regression, and likelihood ratio and Wald tests for the incremental contribution of X* were performed. The patient-specific predictors for each of the models were then used as data for a test comparing the two AUCs. Under the null, the sizes of the likelihood ratio and Wald tests were close to nominal, but the area test was extremely conservative, with test sizes less than 0.006 for all configurations studied. Where X* was associated with outcome, the area test had much lower power than the likelihood ratio and Wald tests.
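The simulation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: it assumes X and X* are independent, an outcome prevalence of 0.5, a sample size of 500, and arbitrary mean shifts, and it uses a DeLong-type variance for the area test, which the abstract does not name explicitly. All function names (`fit_logistic`, `delong_test`, `simulate_once`, `rejection_rates`) are hypothetical.

```python
import math
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic fit: returns (coefficients, covariance, log-likelihood)."""
    X = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    loglik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    return beta, cov, loglik

def placements(scores, y):
    """Placement values: per-case and per-control averages of the comparison kernel."""
    case, ctrl = scores[y == 1], scores[y == 0]
    # kernel = 1 if case score > control score, 0.5 if tied, 0 otherwise
    psi = (case[:, None] > ctrl[None, :]) + 0.5 * (case[:, None] == ctrl[None, :])
    return psi.mean(axis=1), psi.mean(axis=0)

def delong_test(s_a, s_b, y):
    """Two-sided p-value for AUC(s_b) - AUC(s_a) using a DeLong-type variance."""
    v10_a, v01_a = placements(s_a, y)
    v10_b, v01_b = placements(s_b, y)
    auc_a, auc_b = v10_a.mean(), v10_b.mean()
    S = (np.cov(np.vstack([v10_a, v10_b])) / len(v10_a)
         + np.cov(np.vstack([v01_a, v01_b])) / len(v01_a))
    var = S[0, 0] + S[1, 1] - 2.0 * S[0, 1]
    if var <= 0:  # identical rankings: no evidence of a difference
        return auc_a, auc_b, 1.0
    z = (auc_b - auc_a) / math.sqrt(var)
    return auc_a, auc_b, math.erfc(abs(z) / math.sqrt(2.0))

def simulate_once(rng, n=500, mu=0.3, mu_star=0.0):
    """One data set; returns p-values for the (LR, Wald, area) tests of X*."""
    y = rng.binomial(1, 0.5, size=n).astype(float)  # assumed prevalence 0.5
    x = rng.normal(mu * y, 1.0)            # existing predictor X
    x_star = rng.normal(mu_star * y, 1.0)  # new predictor X* (independent of X here)
    b0, _, ll0 = fit_logistic(x, y)
    b1, cov1, ll1 = fit_logistic(np.column_stack([x, x_star]), y)
    # 1-df chi-square tail probability: P(chi2 > s) = erfc(sqrt(s / 2))
    lr_p = math.erfc(math.sqrt(max(2.0 * (ll1 - ll0), 0.0) / 2.0))
    wald_z = b1[2] / math.sqrt(cov1[2, 2])
    wald_p = math.erfc(abs(wald_z) / math.sqrt(2.0))
    # fitted linear predictors serve as the patient-specific scores
    score0 = b0[0] + b0[1] * x
    score1 = b1[0] + b1[1] * x + b1[2] * x_star
    _, _, area_p = delong_test(score0, score1, y)
    return lr_p, wald_p, area_p

def rejection_rates(n_sims=200, alpha=0.05, seed=0, **kwargs):
    """Fraction of simulated data sets in which each test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    pvals = np.array([simulate_once(rng, **kwargs) for _ in range(n_sims)])
    return (pvals < alpha).mean(axis=0)

if __name__ == "__main__":
    print("size  (LR, Wald, area):", rejection_rates(mu_star=0.0))
    print("power (LR, Wald, area):", rejection_rates(mu_star=0.3))
```

Setting `mu_star=0.0` estimates test size under the null, while a nonzero `mu_star` estimates power. The exact rejection rates depend on the assumed sample size, effect sizes, and number of replications, so they should not be expected to match the published figures.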

SUMMARY

Evaluation of the statistical significance of a new predictor when there are existing clinical predictors is most appropriately accomplished in the context of a regression model. Although comparison of AUCs is a conceptually equivalent approach to the likelihood ratio and Wald test, it has vastly inferior statistical properties. Use of both approaches will frequently lead to inconsistent conclusions. Nonetheless, comparison of receiver operating characteristic curves remains a useful descriptive tool for initial evaluation of whether a new predictor might be of clinical relevance.
