Low responsiveness of machine learning models to critical or deteriorating health conditions.

Authors

Pias Tanmoy Sarkar, Afrose Sharmin, Tuli Moon Das, Trisha Ipsita Hamid, Deng Xinwei, Nemeroff Charles B, Yao Danfeng Daphne

Affiliations

Department of Computer Science and Sanghani Center for AI and Data Analytics, Virginia Tech, Blacksburg, VA, USA.

Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA.

Publication information

Commun Med (Lond). 2025 Mar 11;5(1):62. doi: 10.1038/s43856-025-00775-0.

Abstract

BACKGROUND

Machine learning (ML) based mortality prediction models can be immensely useful in intensive care units. Such a model should generate warnings to alert physicians when a patient's condition rapidly deteriorates, or their vitals are in highly abnormal ranges. Before clinical deployment, it is important to comprehensively assess a model's ability to recognize critical patient conditions.

METHODS

We develop multiple medical ML testing approaches, including a gradient ascent method and neural activation map. We systematically assess these machine learning models' ability to respond to serious medical conditions using additional test cases, some of which are time series. Guided by medical doctors, our evaluation involves multiple machine learning models, resampling techniques, and four datasets for two clinical prediction tasks.
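The abstract names a gradient ascent method but gives no details, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes a hypothetical toy logistic risk model with made-up weights (`WEIGHTS`, `BIAS`) over normalized features, and it moves an input step by step in the direction that raises the predicted risk, clipping each feature to an assumed plausible range. A test case generated this way can then be checked against what a clinician would expect the risk to be.

```python
import math

# Hypothetical toy risk model (assumption): logistic regression with
# made-up weights over three normalized features. Not from the paper.
WEIGHTS = [0.8, -0.5, 0.3]
BIAS = -1.0


def predict_risk(x):
    """Return the model's predicted risk in (0, 1) for feature vector x."""
    z = BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def risk_gradient(x):
    """Gradient of the predicted risk with respect to the input features."""
    p = predict_risk(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def gradient_ascent(x0, bounds, step=0.5, iters=200):
    """Move the input uphill on predicted risk, clipping to feature bounds."""
    x = list(x0)
    for _ in range(iters):
        g = risk_gradient(x)
        x = [
            min(max(xi + step * gi, lo), hi)
            for xi, gi, (lo, hi) in zip(x, g, bounds)
        ]
    return x


start = [0.0, 0.0, 0.0]          # assumed "typical" patient after normalization
bounds = [(-3.0, 3.0)] * 3       # assumed plausible range for each feature
extreme = gradient_ascent(start, bounds)
print(predict_risk(start), predict_risk(extreme))
```

For a real model the gradient would typically come from automatic differentiation rather than a closed-form expression, and the bounds would be set with clinical guidance.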

RESULTS

We identify serious deficiencies in the models' responsiveness, with the models being unable to recognize severely impaired medical conditions or rapidly deteriorating health. For in-hospital mortality prediction, the models tested using our synthesized cases fail to recognize 66% of the injuries. In some instances, the models fail to generate adequate mortality risk scores for all test cases. Our study identifies similar kinds of deficiencies in the responsiveness of 5-year breast and lung cancer prediction models.
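As an illustration of how a failure rate of this kind could be computed, the sketch below counts the share of critical test cases whose predicted risk stays below an alert threshold. The function name `miss_rate`, the 0.5 threshold, and the example scores are all assumptions made for illustration. They are not taken from the study.

```python
def miss_rate(risk_scores, threshold=0.5):
    """Fraction of critical test cases scored below the alert threshold.

    risk_scores: predicted risks for cases that are known to be critical.
    threshold: assumed alert cut-off (hypothetical value, not from the paper).
    """
    if not risk_scores:
        raise ValueError("risk_scores must not be empty")
    missed = sum(1 for score in risk_scores if score < threshold)
    return missed / len(risk_scores)


# Made-up scores for four hypothetical critical test cases.
example_scores = [0.9, 0.2, 0.4, 0.7]
print(miss_rate(example_scores))
```

A lower value is better here: a model that responds to every critical case would have a miss rate of 0.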

CONCLUSIONS

Using generated test cases, we find that statistical machine-learning models trained solely from patient data are grossly insufficient and have many dangerous blind spots. Most of the ML models tested fail to respond adequately to critically ill patients. How to incorporate medical knowledge into clinical machine learning models is an important future research direction.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47ce/11897252/3dfaffa5e0a7/43856_2025_775_Fig1_HTML.jpg