A vision-language foundation model for the generation of realistic chest X-ray images.

Author information

Bluethgen Christian, Chambon Pierre, Delbrouck Jean-Benoit, van der Sluijs Rogier, Połacin Małgorzata, Zambrano Chaves Juan Manuel, Abraham Tanishq Mathew, Purohit Shivanshu, Langlotz Curtis P, Chaudhari Akshay S

Affiliations

Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.

Department of Radiology, Stanford University, Palo Alto, CA, USA.

Publication information

Nat Biomed Eng. 2025 Apr;9(4):494-506. doi: 10.1038/s41551-024-01246-y. Epub 2024 Aug 26.

Abstract

The paucity of high-quality medical imaging datasets could be mitigated by machine learning models that generate compositionally diverse images that faithfully represent medical concepts and pathologies. However, large vision-language models are trained on natural images, and the diversity distribution of the generated images substantially differs from that of medical images. Moreover, medical language involves specific and semantically rich vocabulary. Here we describe a domain-adaptation strategy for large vision-language models that overcomes distributional shifts. Specifically, by leveraging publicly available datasets of chest X-ray images and the corresponding radiology reports, we adapted a latent diffusion model pre-trained on pairs of natural images and text descriptors to generate diverse and visually plausible synthetic chest X-ray images (as confirmed by board-certified radiologists) whose appearance can be controlled with free-form medical text prompts. The domain-adaptation strategy for the text-conditioned synthesis of medical images can be used to augment training datasets and is a viable alternative to the sharing of real medical images for model training and fine-tuning.
