Liu Aohan, Guo Yuchen, Yong Jun-Hai, Xu Feng
IEEE Trans Med Imaging. 2024 Jul;43(7):2657-2669. doi: 10.1109/TMI.2024.3372638. Epub 2024 Jul 1.
The automatic generation of accurate radiology reports is of great clinical importance and has drawn growing research interest. However, it remains a challenging task due to the imbalance between normal and abnormal descriptions and the multi-sentence, multi-topic nature of radiology reports. These properties make it difficult to generate accurate descriptions for medical images, especially for the important abnormal findings. Previous methods for tackling these problems rely heavily on extra manual annotations, which are expensive to acquire. We propose a multi-grained report generation framework incorporating sentence-level image-sentence contrastive learning, which requires no extra labeling yet effectively learns knowledge from image-report pairs. We first introduce contrastive learning as an auxiliary task for image feature learning. Unlike previous contrastive methods, we exploit the multi-topic nature of imaging reports and perform fine-grained contrastive learning: we extract sentence topics and contents, and contrast sentence contents against refined image contents guided by the sentence topics. This forces the model to learn distinct abnormal image features for each specific topic. During generation, we use two decoders to first produce coarse sentence topics and then the fine-grained text of each sentence. We directly supervise the intermediate topics using the sentence topics learned by our contrastive objective, which strengthens the generation constraint and enables independent fine-tuning of the decoders with reinforcement learning, further boosting model performance. Experiments on two large-scale datasets, MIMIC-CXR and IU-Xray, demonstrate that our approach outperforms existing state-of-the-art methods in terms of both language generation metrics and clinical accuracy.
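To make the fine-grained contrastive objective concrete, the sketch below shows one plausible reading of it: the sentence-topic embedding acts as a cross-attention query over image patch features to produce a topic-refined image representation, which is then contrasted against the sentence-content embedding with a symmetric InfoNCE loss. This is a minimal illustration, not the paper's implementation; all module names, dimensions, the attention-pooling design, and the temperature are assumptions introduced here.

```python
# Minimal sketch of topic-guided image-sentence contrastive learning.
# All layer choices and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicGuidedContrast(nn.Module):
    def __init__(self, img_dim=512, txt_dim=512, embed_dim=256, temperature=0.07):
        super().__init__()
        # The topic embedding queries the image patches, "refining" the
        # image content toward the region relevant to that sentence topic.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.topic_proj = nn.Linear(txt_dim, embed_dim)
        self.content_proj = nn.Linear(txt_dim, embed_dim)
        self.temperature = temperature

    def forward(self, img_feats, topic_emb, content_emb):
        # img_feats:   (B, P, img_dim)  patch features of each image
        # topic_emb:   (B, txt_dim)     embedding of one sentence's topic
        # content_emb: (B, txt_dim)     embedding of that sentence's content
        v = self.img_proj(img_feats)                       # (B, P, E)
        q = self.topic_proj(topic_emb).unsqueeze(1)        # (B, 1, E)
        refined, _ = self.attn(q, v, v)                    # topic-refined image content
        refined = F.normalize(refined.squeeze(1), dim=-1)  # (B, E)
        content = F.normalize(self.content_proj(content_emb), dim=-1)

        # Symmetric InfoNCE: matched (image, sentence-content) pairs are
        # positives; all other pairs in the batch serve as negatives.
        logits = refined @ content.t() / self.temperature  # (B, B)
        targets = torch.arange(logits.size(0), device=logits.device)
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))
```

Because the image representation is re-pooled per topic before the contrast, each topic can attract a different subset of patches, which is one way the model could be pushed to learn distinct abnormal image features per topic as the abstract describes.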
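The coarse-to-fine generation step can likewise be sketched as two chained decoders: a topic decoder emits one latent topic vector per sentence (the intermediate output that receives direct supervision from the contrastively learned sentence topics), and a sentence decoder expands each topic into words. The GRU choices, layer sizes, and teacher-forcing setup below are hypothetical stand-ins for whatever architecture the paper actually uses.

```python
# Minimal sketch of two-stage (topic -> sentence) report decoding.
# Architecture details are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn

class TwoStageReportDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden=512, max_sents=8):
        super().__init__()
        self.max_sents = max_sents
        self.topic_rnn = nn.GRUCell(hidden, hidden)      # one step per sentence
        self.topic_head = nn.Linear(hidden, embed_dim)   # intermediate topic output
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.sent_rnn = nn.GRU(embed_dim, hidden, batch_first=True)
        self.word_head = nn.Linear(hidden, vocab_size)
        self.init_from_topic = nn.Linear(embed_dim + hidden, hidden)

    def forward(self, img_global, word_inputs):
        # img_global:  (B, hidden)  pooled image feature
        # word_inputs: (B, S, T)    gold word ids per sentence (teacher forcing)
        B, S, T = word_inputs.shape
        h = img_global
        topics, word_logits = [], []
        for s in range(min(S, self.max_sents)):
            h = self.topic_rnn(img_global, h)            # advance topic state
            topic = self.topic_head(h)                   # (B, E); supervised against
            topics.append(topic)                         # learned sentence topics
            h0 = self.init_from_topic(torch.cat([topic, img_global], -1))
            emb = self.word_embed(word_inputs[:, s])     # (B, T, E)
            out, _ = self.sent_rnn(emb, h0.unsqueeze(0)) # (B, T, hidden)
            word_logits.append(self.word_head(out))      # (B, T, vocab)
        # The topic sequence gets a direct loss against the contrastively
        # learned sentence topics; the word logits get the usual word-level
        # cross-entropy against the reference report.
        return torch.stack(topics, 1), torch.stack(word_logits, 1)
```

Supervising the intermediate topic vectors directly, as the abstract states, decouples the two decoders: each has its own training signal, which is what would allow the independent reinforcement-learning fine-tuning the authors describe.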