A multimodal generative AI copilot for human pathology.

Affiliations

Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.

Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.

Publication information

Nature. 2024 Oct;634(8033):466-473. doi: 10.1038/s41586-024-07618-3. Epub 2024 Jun 12.


DOI: 10.1038/s41586-024-07618-3
PMID: 38866050
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11464372/

Abstract

Computational pathology has witnessed considerable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders. However, despite the explosive growth of generative artificial intelligence (AI), there have been few studies on building general-purpose multimodal AI assistants and copilots tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question and answer turns. We compare PathChat with several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4 (ref. ). PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases with diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat may potentially find impactful applications in pathology education, research and human-in-the-loop clinical decision-making.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/476c8a421ce3/41586_2024_7618_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/0d8c1619dae4/41586_2024_7618_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/47384047db6b/41586_2024_7618_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/67278077074f/41586_2024_7618_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/5d5e863747f1/41586_2024_7618_Fig5_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/bb4473a8fdf7/41586_2024_7618_Fig6_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/85a34d54933f/41586_2024_7618_Fig7_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/4167189a92c4/41586_2024_7618_Fig8_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/02ab74a3381a/41586_2024_7618_Fig9_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/ff229b6775ed/41586_2024_7618_Fig10_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/6f04d66992c4/41586_2024_7618_Fig11_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/3f37205ac598/41586_2024_7618_Fig12_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/b6f49c292a6b/41586_2024_7618_Fig13_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0874/11464372/61ff13923cd4/41586_2024_7618_Fig14_ESM.jpg

Similar articles

[1]
A multimodal generative AI copilot for human pathology.

Nature. 2024-10

[2]
Prescription of Controlled Substances: Benefits and Risks

2025-1

[3]
Performance of 3 Conversational Generative Artificial Intelligence Models for Computing Maximum Safe Doses of Local Anesthetics: Comparative Analysis.

JMIR AI. 2025-5-13

[4]
AI in Medical Questionnaires: Innovations, Diagnosis, and Implications.

J Med Internet Res. 2025-6-23

[5]
Unveiling GPT-4V's hidden challenges behind high accuracy on USMLE questions: Observational Study.

J Med Internet Res. 2025-2-7

[6]
Navigating the future of pediatric cardiovascular surgery: Insights and innovation powered by Chat Generative Pre-Trained Transformer (ChatGPT).

J Thorac Cardiovasc Surg. 2025-2-1

[7]
Examining the Role of Large Language Models in Orthopedics: Systematic Review.

J Med Internet Res. 2024-11-15

[8]
Artificial intelligence for diagnosing exudative age-related macular degeneration.

Cochrane Database Syst Rev. 2024-10-17

[9]
Menstrual Health Education Using a Specialized Large Language Model in India: Development and Evaluation Study of MenstLLaMA.

J Med Internet Res. 2025-7-16

[10]
Artificial intelligence for detecting keratoconus.

Cochrane Database Syst Rev. 2023-11-15

Cited by

[1]
Multimodal integration strategies for clinical application in oncology.

Front Pharmacol. 2025-8-20

[2]
An eyecare foundation model for clinical assistance: a randomized controlled trial.

Nat Med. 2025-8-28

[3]
HistoChat: Instruction-tuning multimodal vision language assistant for colorectal histopathology on limited data.

Patterns (N Y). 2025-5-30

[4]
Digital and Artificial Intelligence-based Pathology: Not for Every Laboratory - A Mini-review on the Benefits and Pitfalls of Its Implementation.

J Clin Transl Pathol. 2025-6

[5]
A vision of human-AI collaboration for enhanced biological collection curation and research.

Bioscience. 2025-3-28

[6]
Democratizing advanced surgical guidance: decoupling the state-of-the-art from tertiary centers and breaking trail for autonomous robotic surgery in austere environments.

Proc SPIE Int Soc Opt Eng. 2025-1-19

[7]
Utilizing multimodal artificial intelligence to advance cardiovascular diseases.

Precis Clin Med. 2025-7-17

[8]
Prompt injection attacks on vision-language models for surgical decision support.

medRxiv. 2025-7-23

[9]
Deep-learning triage of 3D pathology datasets for comprehensive and efficient pathologist assessments.

bioRxiv. 2025-7-22

[10]
A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches.

Brief Bioinform. 2025-7-2

References

[1]
A visual-language foundation model for computational pathology.

Nat Med. 2024-3

[2]
Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study.

Cancer Cell. 2023-9-11

[3]
A visual-language foundation model for pathology image analysis using medical Twitter.

Nat Med. 2023-9

[4]
Large language models encode clinical knowledge.

Nature. 2023-8

[5]
Foundation models for generalist medical artificial intelligence.

Nature. 2023-4

[6]
Artificial intelligence reveals features associated with breast cancer neoadjuvant chemotherapy responses from multi-stain histopathologic images.

NPJ Precis Oncol. 2023-1-27

[7]
One model is all you need: Multi-task learning enables simultaneous histology image segmentation and classification.

Med Image Anal. 2023-1

[8]
Artificial intelligence for multimodal data integration in oncology.

Cancer Cell. 2022-10-10

[9]
Artificial intelligence in histopathology: enhancing cancer research and clinical oncology.

Nat Cancer. 2022-9

[10]
Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning.

Nat Biomed Eng. 2022-12
