
A Systematic Review and Implementation Guidelines of Multimodal Foundation Models in Medical Imaging.

Authors

Huang Shih-Cheng, Jensen Malte, Yeung-Levy Serena, Lungren Matthew P, Poon Hoifung, Chaudhari Akshay S

Affiliations

Stanford University.

Microsoft Research.

Publication information

Res Sq. 2025 Apr 28:rs.3.rs-5537908. doi: 10.21203/rs.3.rs-5537908/v1.

DOI:10.21203/rs.3.rs-5537908/v1
PMID:40343333
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12060978/
Abstract

Artificial Intelligence (AI) holds immense potential to transform healthcare, yet progress is often hindered by the reliance on large labeled datasets and unimodal data. Multimodal Foundation Models (FMs), particularly those leveraging Self-Supervised Learning (SSL) on multimodal data, offer a paradigm shift towards label-efficient, holistic patient modeling. However, the rapid emergence of these complex models has created a fragmented landscape. Here, we provide a systematic review of multimodal FMs for medical imaging applications. Through rigorous screening of 1,144 publications (2012-2024) and in-depth analysis of 48 studies, we establish a unified terminology and comprehensively assess the current state-of-the-art. Our review aggregates current knowledge, critically identifies key limitations and underexplored opportunities, and culminates in actionable guidelines for researchers, clinicians, developers, and policymakers. This work provides a crucial roadmap to navigate and accelerate the responsible development and clinical translation of next-generation multimodal AI in healthcare.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f1b/12060978/bf1bde364dbf/nihpp-rs5537908v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f1b/12060978/05c4982354c7/nihpp-rs5537908v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f1b/12060978/ae88e987387b/nihpp-rs5537908v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f1b/12060978/cfecab17dcdb/nihpp-rs5537908v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f1b/12060978/6ecff5ee565a/nihpp-rs5537908v1-f0005.jpg

