Evaluating pointwise reliability of machine learning prediction.

Affiliations

Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.

Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, the Netherlands.

Publication information

J Biomed Inform. 2022 Mar;127:103996. doi: 10.1016/j.jbi.2022.103996. Epub 2022 Jan 15.

DOI: 10.1016/j.jbi.2022.103996
PMID: 35041981
Abstract

Interest in Machine Learning applications to tackle clinical and biological problems is increasing. This is driven by promising results reported in many research papers, the increasing number of AI-based software products, and by the general interest in Artificial Intelligence to solve complex problems. It is therefore of importance to improve the quality of machine learning output and add safeguards to support their adoption. In addition to regulatory and logistical strategies, a crucial aspect is to detect when a Machine Learning model is not able to generalize to new unseen instances, which may originate from a population distant to that of the training population or from an under-represented subpopulation. As a result, the prediction of the machine learning model for these instances may be often wrong, given that the model is applied outside its "reliable" space of work, leading to a decreasing trust of the final users, such as clinicians. For this reason, when a model is deployed in practice, it would be important to advise users when the model's predictions may be unreliable, especially in high-stakes applications, including those in healthcare. Yet, reliability assessment of each machine learning prediction is still poorly addressed. Here, we review approaches that can support the identification of unreliable predictions, we harmonize the notation and terminology of relevant concepts, and we highlight and extend possible interrelationships and overlap among concepts. We then demonstrate, on simulated and real data for ICU in-hospital death prediction, a possible integrative framework for the identification of reliable and unreliable predictions. To do so, our proposed approach implements two complementary principles, namely the density principle and the local fit principle. The density principle verifies that the instance we want to evaluate is similar to the training set. The local fit principle verifies that the trained model performs well on training subsets that are more similar to the instance under evaluation. Our work can contribute to consolidating work in machine learning especially in medicine.
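
The two principles named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how similarity or local performance are measured, so the Euclidean k-nearest-neighbour distance, the quantile threshold, the accuracy cutoff, the toy data, and every function name below are assumptions made purely for illustration.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(x, train_X, k):
    """Return (distance, index) pairs for the k training points closest to x."""
    return sorted((euclidean(x, t), i) for i, t in enumerate(train_X))[:k]

def mean_knn_distance(x, train_X, k):
    nn = nearest_neighbors(x, train_X, k)
    return sum(d for d, _ in nn) / len(nn)

def density_check(x, train_X, k=3, quantile=0.95):
    """Density principle (assumed form): x counts as similar to the training
    set if its mean k-NN distance is no larger than a high quantile of the
    same score computed for the training points themselves."""
    ref = sorted(
        mean_knn_distance(t, train_X[:i] + train_X[i + 1:], k)
        for i, t in enumerate(train_X)
    )
    threshold = ref[min(len(ref) - 1, int(quantile * len(ref)))]
    return mean_knn_distance(x, train_X, k) <= threshold

def local_fit_check(x, train_X, train_y, predict, k=3, min_accuracy=0.8):
    """Local fit principle (assumed form): the model must be accurate on
    the k training points most similar to x."""
    nn = nearest_neighbors(x, train_X, k)
    correct = sum(predict(train_X[i]) == train_y[i] for _, i in nn)
    return correct / len(nn) >= min_accuracy

def assess(x, train_X, train_y, predict):
    dense = density_check(x, train_X)
    fits = local_fit_check(x, train_X, train_y, predict)
    return {"dense": dense, "local_fit": fits, "reliable": dense and fits}

# Hypothetical toy data: two clusters, and a model that is wrong on part of one.
train_X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
predict = lambda p: int(p[0] > 5.5)  # misclassifies [5, 5] and [5, 6]

for query in ([0.5, 0.5], [3.0, 3.0], [5.0, 5.5]):
    print(query, assess(query, train_X, train_y, predict))
```

In this toy setup the point between the clusters fails the density check, while the point at [5.0, 5.5] lies close to training data but in a region where the model misclassifies its neighbours, so only the local fit check flags it. That is the sense in which the two checks are complementary.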

