Kerr Kathleen F, Janes Holly
Department of Biostatistics, University of Washington, Box 357232, Seattle, WA, 98115, U.S.A.
Fred Hutchinson Cancer Research Center, Vaccine and Infectious Disease and Public Health Sciences Divisions, 1100 Fairview Ave N M2 C200, Seattle, WA, 98109, U.S.A.
Stat Med. 2017 Dec 10;36(28):4503-4508. doi: 10.1002/sim.7341.
Developing new measures of risk model performance is an active line of research, often motivated by the conventional wisdom that area under the ROC curve is an 'insensitive' measure of the additional predictive capacity offered by new biomarkers. Without endorsing area under the ROC curve, we argue that this charge is not substantiated. Three articles in this issue discuss alternative metrics of risk model performance: NRI(p) (two-category net reclassification index at the event rate), integrated discrimination index, and R-squared statistics. Guided by the principle that performance metrics should match the intended use of a risk prediction model, we argue that routine use of these indices is not justified. Instead, we recommend decision-theoretic measures to evaluate risk prediction models for applications in which clinically relevant risk thresholds have been established for classifying individuals. In the absence of established risk thresholds, additional research is needed to develop suitable metrics. Copyright © 2017 John Wiley & Sons, Ltd.
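As an illustration of what a decision-theoretic measure can look like when a risk threshold is established, the following minimal sketch computes net benefit, a widely used measure of this kind. The abstract does not name a specific measure, so the choice of net benefit, the function name `net_benefit`, and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def net_benefit(risks: Sequence[float], outcomes: Sequence[int], threshold: float) -> float:
    """Net benefit of classifying as high risk everyone with predicted risk >= threshold.

    Illustrative helper (hypothetical name, not taken from the article).
    Net benefit = TP/n - (FP/n) * threshold / (1 - threshold),
    where the odds of the threshold weight false positives against true positives.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(risks) != len(outcomes) or len(risks) == 0:
        raise ValueError("risks and outcomes must be non-empty and of equal length")

    n = len(risks)
    # True positives: flagged as high risk and the event occurred.
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    # False positives: flagged as high risk but the event did not occur.
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data (invented for illustration): predicted risks from a baseline model
# and from a model that adds a hypothetical new biomarker.
outcomes = [0, 0, 1, 1]
baseline_risks = [0.1, 0.6, 0.4, 0.8]
expanded_risks = [0.1, 0.3, 0.6, 0.8]

threshold = 0.5
nb_baseline = net_benefit(baseline_risks, outcomes, threshold)
nb_expanded = net_benefit(expanded_risks, outcomes, threshold)
# "Treat all" reference strategy: every individual is classified as high risk.
nb_treat_all = net_benefit([1.0] * len(outcomes), outcomes, threshold)

print(f"baseline: {nb_baseline:.3f}, expanded: {nb_expanded:.3f}, treat all: {nb_treat_all:.3f}")
```

The difference `nb_expanded - nb_baseline` at the clinically relevant threshold is then one way to express the incremental value of the new marker on a scale tied to the intended use of the model. The measure is only defined once a threshold has been chosen.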