A vision-language foundation model for precision oncology.

Authors

Xiang Jinxi, Wang Xiyue, Zhang Xiaoming, Xi Yinghua, Eweje Feyisope, Chen Yijiang, Li Yuchen, Bergstrom Colin, Gopaulchan Matthew, Kim Ted, Yu Kun-Hsing, Willens Sierra, Olguin Francesca Maria, Nirschl Jeffrey J, Neal Joel, Diehn Maximilian, Yang Sen, Li Ruijiang

Affiliations

Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.

Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA.

Publication information

Nature. 2025 Feb;638(8051):769-778. doi: 10.1038/s41586-024-08378-w. Epub 2025 Jan 8.


DOI: 10.1038/s41586-024-08378-w
PMID: 39779851

Abstract

Clinical decision-making is driven by multimodal data, including clinical notes and pathological characteristics. Artificial intelligence approaches that can effectively integrate multimodal data hold significant promise in advancing clinical care. However, the scarcity of well-annotated multimodal datasets in clinical settings has hindered the development of useful models. In this study, we developed the Multimodal transformer with Unified maSKed modeling (MUSK), a vision-language foundation model designed to leverage large-scale, unlabelled, unpaired image and text data. MUSK was pretrained on 50 million pathology images from 11,577 patients and one billion pathology-related text tokens using unified masked modelling. It was further pretrained on one million pathology image-text pairs to efficiently align the vision and language features. With minimal or no further training, MUSK was tested in a wide range of applications and demonstrated superior performance across 23 patch-level and slide-level benchmarks, including image-to-text and text-to-image retrieval, visual question answering, image classification and molecular biomarker prediction. Furthermore, MUSK showed strong performance in outcome prediction, including melanoma relapse prediction, pan-cancer prognosis prediction and immunotherapy response prediction in lung and gastro-oesophageal cancers. MUSK effectively combined complementary information from pathology images and clinical reports and could potentially improve diagnosis and precision in cancer therapy.
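
The image-to-text and text-to-image retrieval benchmarks mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how MUSK scores or ranks candidates, so the code below assumes a common generic setup: each image and each text is represented by an embedding vector, candidates are ranked by cosine similarity, and retrieval quality is summarized as recall@k. The function names (`cosine`, `retrieve`, `recall_at_k`) and the toy three-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, candidates, k=1):
    """Return the indices of the k candidates most similar to the query."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(queries, candidates, k=1):
    """Fraction of queries whose paired candidate (same index) appears in the top k."""
    hits = sum(1 for i, q in enumerate(queries) if i in retrieve(q, candidates, k))
    return hits / len(queries)


# Toy, made-up embeddings: text_emb[i] is paired with image_emb[i].
text_emb = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
image_emb = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.0, 0.1, 0.9]]

if __name__ == "__main__":
    # Text-to-image: rank images for each text query.
    print("text-to-image recall@1:", recall_at_k(text_emb, image_emb, k=1))
    # Image-to-text: swap the roles of queries and candidates.
    print("image-to-text recall@1:", recall_at_k(image_emb, text_emb, k=1))
```

Because the same similarity function serves both directions, swapping the query and candidate sets is enough to move between text-to-image and image-to-text evaluation in this sketch.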

