IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-Training.

Authors

Liu Che, Cheng Sibo, Shi Miaojing, Shah Anand, Bai Wenjia, Arcucci Rossella

Publication information

IEEE Trans Med Imaging. 2025 Jan;44(1):519-529. doi: 10.1109/TMI.2024.3449690. Epub 2025 Jan 2.

Abstract

In medical Vision-Language Pre-training (VLP), significant work focuses on extracting text and image features from clinical reports and medical images. Yet existing methods may have overlooked the potential of the natural hierarchical structure of clinical reports, which are typically divided into 'findings' for description and 'impressions' for conclusions. Current VLP approaches tend to oversimplify these reports into a single entity or fragmented tokens, ignoring this structured format. In this work, we propose a novel clinical prior guided VLP framework named IMITATE to learn the structure information from medical reports with hierarchical vision-language alignment. The framework derives multi-level visual features from chest X-ray (CXR) images and separately aligns these features with the descriptive and the conclusive text encoded in the hierarchical medical report. Furthermore, a new clinical-informed contrastive loss is introduced for cross-modal learning, which accounts for clinical prior knowledge when formulating sample correlations in contrastive learning. The proposed model, IMITATE, outperforms baseline VLP methods across six different datasets, spanning five medical imaging downstream tasks. Experimental results show the benefits of using hierarchical structures in medical reports for VLP. Code: https://github.com/cheliu-computation/IMITATE-TMI2024.
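
The two ideas in the abstract, hierarchical alignment and a contrastive loss with softened sample correlations, can be illustrated with a small pure-Python sketch. This is not the authors' implementation (see the linked repository for that). Every function name here is hypothetical. The pairing of mid-level visual features with 'findings' and high-level features with 'impressions' is an assumption, as is the use of report-to-report similarity as a stand-in for clinical prior knowledge, and only the image-to-text direction of the loss is shown.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def similarity_matrix(a, b, temperature=1.0):
    """Cosine similarity of every row of a with every row of b, divided by temperature."""
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) / temperature for w in b] for u in a]

def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def prior_targets(text_emb, smoothing=0.2):
    """Hypothetical soft targets: mostly the matched pair, with a small share
    spread over other samples according to report-to-report similarity
    (an assumed proxy for clinical prior knowledge)."""
    sims = similarity_matrix(text_emb, text_emb)
    n = len(text_emb)
    targets = []
    for i, row in enumerate(sims):
        prior = softmax(row)
        targets.append([(1 - smoothing) * (1.0 if i == j else 0.0) + smoothing * prior[j]
                        for j in range(n)])
    return targets

def soft_contrastive_loss(image_emb, text_emb, targets, temperature=0.1):
    """Cross-entropy between softmax(image-to-text similarities) and soft target rows."""
    logits = similarity_matrix(image_emb, text_emb, temperature)
    loss = 0.0
    for logit_row, target_row in zip(logits, targets):
        probs = softmax(logit_row)
        loss -= sum(t * math.log(p) for t, p in zip(target_row, probs) if t > 0)
    return loss / len(logits)

def hierarchical_loss(mid_feats, high_feats, findings_emb, impression_emb):
    """Sum of two alignment terms, one per report section (assumed pairing)."""
    descriptive = soft_contrastive_loss(mid_feats, findings_emb, prior_targets(findings_emb))
    conclusive = soft_contrastive_loss(high_feats, impression_emb, prior_targets(impression_emb))
    return descriptive + conclusive

# Toy batch of three samples: embeddings that match their reports exactly.
eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rotated = eye[1:] + eye[:1]  # reports shifted by one position, so every pair is mismatched

aligned = hierarchical_loss(eye, eye, eye, eye)
mismatched = hierarchical_loss(eye, eye, rotated, rotated)
print(aligned < mismatched)
```

In this toy batch the correctly paired embeddings yield a lower loss than the shifted ones, which is the behaviour a contrastive objective is meant to reward. The soft targets keep the loss from treating every non-matching report as an equally hard negative.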

