From large language models to multimodal AI: a scoping review on the potential of generative AI in medicine.

Authors

Buess Lukas, Keicher Matthias, Navab Nassir, Maier Andreas, Tayebi Arasteh Soroosh

Affiliations

Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.

Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany.

Publication information

Biomed Eng Lett. 2025 Aug 22;15(5):845-863. doi: 10.1007/s13534-025-00497-1. eCollection 2025 Sep.

Abstract

Generative artificial intelligence (AI) models, such as diffusion models and OpenAI's ChatGPT, are transforming medicine by enhancing diagnostic accuracy and automating clinical workflows. The field has advanced rapidly, evolving from text-only large language models for tasks such as clinical documentation and decision support to multimodal AI systems capable of integrating diverse data modalities, including imaging, text, and structured data, within a single model. The diverse landscape of these technologies, along with rising interest, highlights the need for a comprehensive review of their applications and potential. This scoping review explores the evolution of multimodal AI, highlighting its methods, applications, datasets, and evaluation in clinical settings. Adhering to PRISMA-ScR guidelines, we systematically queried PubMed, IEEE Xplore, and Web of Science, prioritizing recent studies published up to the end of 2024. After rigorous screening, 145 papers were included, revealing key trends and challenges in this dynamic field. Our findings underscore a shift from unimodal to multimodal approaches, driving innovations in diagnostic support, medical report generation, drug discovery, and conversational AI. However, critical challenges remain, including integrating heterogeneous data types, improving model interpretability, addressing ethical concerns, and validating AI systems in real-world clinical settings. This review summarizes the current state of the art, identifies critical gaps, and provides insights to guide the development of scalable, trustworthy, and clinically impactful multimodal AI solutions in healthcare.

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s13534-025-00497-1.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5247/12411359/760630c34222/13534_2025_497_Fig1_HTML.jpg
