Assessing the accuracy of predictive models for numerical data: Not r nor r², why not? Then what?

Author

Li Jin

Affiliation

National Earth and Marine Observations, Environmental Geoscience Division, Geoscience Australia, Canberra, Australian Capital Territory, Australia.

Publication

PLoS One. 2017 Aug 24;12(8):e0183250. doi: 10.1371/journal.pone.0183250. eCollection 2017.

DOI:10.1371/journal.pone.0183250
PMID:28837692
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5570302/
Abstract

Assessing the accuracy of predictive models is critical because predictive models are increasingly used across disciplines and predictive accuracy determines the quality of the resulting predictions. The Pearson product-moment correlation coefficient (r) and the coefficient of determination (r²) are among the most widely used measures for assessing predictive models for numerical data, although they have been argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate what is used in the calculation of r and r², and simulations were used to demonstrate the behaviour of r and r² and to compare three accuracy measures under various scenarios. Relevant confusions about r and r² have been clarified. The calculation of r and r² is not based on the differences between the predicted and observed values. The existing error measures suffer from various limitations and cannot indicate accuracy. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe's efficiency (E1) is also an alternative accuracy measure. The r and r² do not measure accuracy and are incorrect accuracy measures. VEcv and E1 are recommended for assessing accuracy. Applying these accuracy measures would encourage the development of more accurate predictive models to generate predictions for evidence-informed decision-making.
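The abstract's central claim, that r is not computed from the differences between predicted and observed values, can be illustrated with a small sketch. The formulas below are assumptions, since the abstract does not state them: VEcv is taken as (1 - Σ(yᵢ - ŷᵢ)² / Σ(yᵢ - ȳ)²) × 100 and E1 as (1 - Σ|yᵢ - ŷᵢ| / Σ|yᵢ - ȳ|) × 100, both as percentages. The function names and toy data are hypothetical, and in practice the predictions would come from cross-validation rather than hand-written lists.

```python
from math import sqrt

def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)

def vecv(obs, pred):
    """Variance explained (%), assumed form: squared errors vs. variation of obs."""
    mean_o = sum(obs) / len(obs)
    ss_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    return (1 - ss_err / ss_tot) * 100

def e1(obs, pred):
    """Legates and McCabe's E1 (%), assumed form: absolute errors instead of squared."""
    mean_o = sum(obs) / len(obs)
    abs_err = sum(abs(o - p) for o, p in zip(obs, pred))
    abs_tot = sum(abs(o - mean_o) for o in obs)
    return (1 - abs_err / abs_tot) * 100

# Hypothetical toy data: predictions stand in for cross-validated output.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
close = [1.1, 1.9, 3.2, 3.8, 5.0]       # small errors
biased = [2 * o + 3 for o in obs]       # perfectly correlated, badly wrong

for name, pred in [("close", close), ("biased", biased)]:
    print(name, round(pearson_r(obs, pred), 3),
          round(vecv(obs, pred), 1), round(e1(obs, pred), 1))
```

Under these assumed formulas, the biased predictions have r = 1 even though every prediction is far from its observed value, while VEcv and E1 are strongly negative. A negative value means the predictions do worse than simply using the mean of the observations.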


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e6b/5570302/046c635a0c25/pone.0183250.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e6b/5570302/a20364c437b9/pone.0183250.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e6b/5570302/1c9c7439afe1/pone.0183250.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e6b/5570302/ad2cdeea8d79/pone.0183250.g004.jpg
