A generalist vision-language foundation model for diverse biomedical tasks.

Affiliations

Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA, USA.

School of Computing, University of Georgia, Athens, GA, USA.

Publication information

Nat Med. 2024 Nov;30(11):3129-3141. doi: 10.1038/s41591-024-03185-2. Epub 2024 Aug 7.

DOI: 10.1038/s41591-024-03185-2

PMID: 39112796

Abstract

Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalist AI holds the potential to address these limitations due to its versatility in interpreting different data types and generating tailored outputs for diverse needs. However, existing biomedical generalist AI solutions are typically heavyweight and closed source to researchers, practitioners and patients. Here, we describe BiomedGPT, the first open-source and lightweight vision-language foundation model, designed as a generalist capable of performing various biomedical tasks. BiomedGPT achieved state-of-the-art results in 16 out of 25 experiments while maintaining a computing-friendly model scale. We also conducted human evaluations to assess the capabilities of BiomedGPT in radiology visual question answering, report generation and summarization. BiomedGPT exhibits robust prediction ability with a low error rate of 3.8% in question answering, satisfactory performance with an error rate of 8.3% in writing complex radiology reports, and competitive summarization ability with a nearly equivalent preference score to human experts. Our method demonstrates that effective training with diverse data can lead to more practical biomedical AI for improving diagnosis and workflow efficiency.
