A visual-language foundation model for computational pathology.

Affiliations

Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.

Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.

Publication information

Nat Med. 2024 Mar;30(3):863-874. doi: 10.1038/s41591-024-02856-4. Epub 2024 Mar 19.

DOI:10.1038/s41591-024-02856-4

PMID:38504017

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11384335/

Abstract

The accelerated adoption of digital pathology and advances in deep learning have enabled the development of robust models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain, and a model's usage is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, a stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text and, notably, over 1.17 million image-caption pairs through task-agnostic pretraining. Evaluated on a suite of 14 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving histopathology images and/or text, achieving state-of-the-art performance on histology image classification, segmentation, captioning, and text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning.
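
The abstract names contrastive learning from image-caption pairs and zero-shot transfer but does not spell out the training objective. As a rough illustration only, the following is a minimal sketch of a generic CLIP-style symmetric contrastive loss and of zero-shot classification by prompt similarity. It is not CONCH's actual implementation: the function names, the temperature value of 0.07, and the toy embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_logits(image_embs, text_embs, temperature=0.07):
    """Pairwise cosine similarities (rows: images, columns: texts) divided by temperature."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts] for i in imgs]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[k]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image losses over a batch of matched pairs."""
    logits = cosine_logits(image_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))


def zero_shot_classify(image_emb, prompt_embs):
    """Return the index of the class prompt whose embedding is closest to the image."""
    scores = cosine_logits([image_emb], prompt_embs, temperature=1.0)[0]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy 3-dimensional embeddings standing in for encoder outputs (hypothetical values).
images = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
captions = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0]]

aligned = symmetric_contrastive_loss(images, captions)
shuffled = symmetric_contrastive_loss(images, captions[::-1])
predicted = [zero_shot_classify(img, captions) for img in images]
```

In this toy setup the loss is lower when each image is paired with its own caption than when the captions are swapped, which is the signal that pulls matched image and text embeddings together during pretraining. Zero-shot classification then reuses the same similarity: class names are written as text prompts, embedded, and the closest prompt is taken as the prediction.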
