Multimodal Large Language Models in Health Care: Applications, Challenges, and Future Outlook.

Author Affiliations

Weill Cornell Medicine-Qatar, Education City, Doha, Qatar.

Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar.

Publication Information

J Med Internet Res. 2024 Sep 25;26:e59505. doi: 10.2196/59505.

Abstract

In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews), text (eg, clinical notes and research articles), videos (eg, surgical procedures), and omics data (eg, genomics and proteomics). While advancements in large language models (LLMs) have enabled new applications for knowledge retrieval and processing in the medical field, most LLMs remain limited to processing unimodal data, typically text-based content, and often overlook the importance of integrating the diverse data modalities encountered in clinical practice. This paper aims to present a detailed, practical, and solution-oriented perspective on the use of multimodal LLMs (M-LLMs) in the medical field. Our investigation spanned M-LLM foundational principles, current and potential applications, technical and ethical challenges, and future research directions. By connecting these elements, we aimed to provide a comprehensive framework that links diverse aspects of M-LLMs, offering a unified vision for their future in health care. This approach aims to guide both future research and practical implementations of M-LLMs in health care, positioning them as a paradigm shift toward integrated, multimodal data-driven medical practice. We anticipate that this work will spark further discussion and inspire the development of innovative approaches in the next generation of medical M-LLM systems.
