Rapid review: Growing usage of Multimodal Large Language Models in healthcare.

Authors

Gupta Pallavi, Zhang Zhihong, Song Meijia, Michalowski Martin, Hu Xiao, Stiglic Gregor, Topaz Maxim

Affiliations

Columbia University, School of Nursing, NY, United States.

Columbia University, School of Nursing, NY, United States; Columbia University, Data Science Institute, NY, United States.

Publication information

J Biomed Inform. 2025 Aug 1:104875. doi: 10.1016/j.jbi.2025.104875.

Abstract

OBJECTIVE

Recent advancements in large language models (LLMs) have led to multimodal LLMs (MLLMs), which integrate multiple data modalities beyond text. Although MLLMs show promise, the literature lacks empirical demonstrations of their impact in healthcare. This paper summarizes the applications of MLLMs in healthcare, highlighting their potential to transform health practices.

METHODS

A rapid literature review was conducted in August 2024 using World Health Organization (WHO) rapid-review methodology and PRISMA standards, with searches across four databases (Scopus, Medline, PubMed, and ACM Digital Library) and top-tier conferences, including NeurIPS, ICML, AAAI, MICCAI, CVPR, ACL, and EMNLP. Articles on healthcare applications of MLLMs were selected for analysis according to predefined inclusion and exclusion criteria.

RESULTS

The search yielded 115 articles, of which 39 were included in the final analysis. Of these, 77% appeared online (as preprints or published articles) in 2024, reflecting the recent emergence of MLLMs. Eighty percent of studies came from Asia and North America (mainly China and the US), with Europe lagging. Studies split evenly between evaluations of pre-built MLLMs (60% focused on GPT versions) and development of custom MLLMs or frameworks with task-specific customizations. About 81% of studies examined MLLMs for diagnosis and reporting in radiology, pathology, and ophthalmology, with additional applications in education, surgery, and mental health. Prompting strategies, used in 80% of studies, improved performance in nearly half. However, evaluation practices were inconsistent, with 67% of studies reporting accuracy. Error analysis was mostly anecdotal, with only 18% of studies categorizing failure types. Only 13% validated explainability through clinician feedback. Clinical deployment was demonstrated in just 3% of studies, and workflow integration, governance, and safety were rarely addressed.

DISCUSSION AND CONCLUSION

MLLMs offer substantial potential for healthcare transformation through multimodal data integration. Yet, methodological inconsistencies, limited validation, and underdeveloped deployment strategies highlight the need for standardized evaluation metrics, structured error analysis, and human-centered design to support safe, scalable, and trustworthy clinical adoption.
