Saidi Pouria, Dasarathy Gautam, Berisha Visar
School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281, USA.
College of Health Solutions, Arizona State University, Tempe, AZ 85281, USA.
Patterns (N Y). 2025 Feb 25;6(4):101185. doi: 10.1016/j.patter.2025.101185. eCollection 2025 Apr 11.
Machine learning (ML) is increasingly used across many disciplines, with impressive reported results. However, recent studies suggest that the published performance of ML models is often overoptimistic. These validity concerns are underscored by findings of an inverse relationship between sample size and reported accuracy in published ML models, in contrast to learning-curve theory, which predicts that accuracy should improve or remain stable as sample size increases. This paper investigates factors contributing to overoptimism in ML-driven science, focusing on overfitting and publication bias. We introduce a stochastic model for observed accuracy that integrates parametric learning curves with the aforementioned biases. We then construct an estimator that corrects for these biases in observed data. Theoretical and empirical results show that our framework can estimate the underlying learning curve, providing realistic performance assessments from published results. By applying the model to meta-analyses of classification of neurological conditions, we estimate the inherent limits of ML-driven prediction in each domain.
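The sketch below is a minimal, hypothetical illustration of the idea described in the abstract, not the authors' code: reported accuracies are treated as observations around a parametric learning curve and a curve is fit to published (sample size, accuracy) pairs. The inverse power-law form, the example data, and all variable names are illustrative assumptions; the paper's actual estimator additionally models and corrects for overfitting and publication bias rather than fitting the observed points directly.

```python
# Hypothetical sketch of fitting a parametric curve to published results.
# All numbers and names are illustrative assumptions, not from the paper.
import numpy as np
from scipy.optimize import curve_fit

def observed_curve(n, a_inf, b, c):
    """Parametric curve for reported accuracy vs. training-set size n.

    a_inf is the asymptotic accuracy; b and c control how quickly the
    small-sample optimism (overfitting + selective publication) decays.
    """
    return a_inf + b * np.power(n, -c)

# Hypothetical published results showing the inverse sample-size/accuracy
# relationship described in the abstract (small studies report higher accuracy).
sample_sizes = np.array([30, 50, 80, 120, 200, 350, 600, 1000], dtype=float)
reported_acc = np.array([0.92, 0.90, 0.88, 0.86, 0.84, 0.83, 0.82, 0.81])

# Naive least-squares fit of the parametric curve to the published points.
params, _ = curve_fit(observed_curve, sample_sizes, reported_acc,
                      p0=[0.80, 0.5, 0.5], maxfev=20000)
a_inf, b, c = params
print(f"Fitted asymptotic accuracy: {a_inf:.3f} "
      f"(optimism term: {b:.2f} * n^-{c:.2f})")
```

Unlike this naive fit, which simply extrapolates the observed points, the estimator described in the abstract models the bias mechanisms themselves, so its estimate of the underlying learning curve is not anchored to the inflated small-sample results.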