A whole-slide foundation model for digital pathology from real-world data.

Affiliations

Microsoft Research, Redmond, WA, USA.

Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA.

Publication information

Nature. 2024 Jun;630(8015):181-188. doi: 10.1038/s41586-024-07441-w. Epub 2024 May 22.

DOI:10.1038/s41586-024-07441-w
PMID:38778098
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11153137/
Abstract

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
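
The tile arithmetic behind the abstract's figures can be sketched as follows. This is a minimal illustration, not the authors' preprocessing pipeline: the 256-pixel tile edge and the two totals (1.3 billion tiles, 171,189 slides) come from the abstract, while the slide dimensions and the helper names `tile_grid` and `tile_count` are hypothetical and chosen only for the example.

```python
TILE = 256  # tile edge in pixels, as stated in the abstract


def tile_grid(width_px, height_px, tile=TILE):
    """Return (columns, rows) of tiles needed to cover a slide.

    Partial tiles at the right and bottom edges count as full tiles.
    """
    cols = (width_px + tile - 1) // tile  # integer ceiling division
    rows = (height_px + tile - 1) // tile
    return cols, rows


def tile_count(width_px, height_px, tile=TILE):
    """Total number of tiles in the full grid over a slide."""
    cols, rows = tile_grid(width_px, height_px, tile)
    return cols * rows


# Hypothetical slide of 32,768 x 32,768 px (about 1.07 gigapixels).
print(tile_count(32_768, 32_768))  # 16384

# Mean tiles per slide implied by the abstract's (rounded) totals.
mean_tiles = 1_300_000_000 / 171_189
print(round(mean_tiles))  # 7594
```

A full grid over a roughly one-gigapixel slide already yields about 16,000 tiles, which is consistent with the "tens of thousands" scale the abstract describes for larger slides. The implied mean of about 7,600 tiles per slide is lower; one plausible reason is that not every grid cell is kept as a training tile, but the abstract does not state how tiles were selected.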
