Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment.

Authors

Ji Zhanghexuan, Shaikh Mohammad Abuzar, Moukheiber Dana, Srihari Sargur N, Peng Yifan, Gao Mingchen

Affiliations

Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA.

Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.

Publication information

Mach Learn Med Imaging. 2021 Sep;12966:110-119. doi: 10.1007/978-3-030-87589-3_12. Epub 2021 Sep 21.

DOI: 10.1007/978-3-030-87589-3_12
PMID: 35647616
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9134785/
Abstract

Self-supervised learning provides an opportunity to explore unlabeled chest X-rays and their associated free-text reports accumulated in clinical routine without manual supervision. This paper proposes a Joint Image Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports. The model was pre-trained on both the global image-sentence level and the local image region-word level for visual-textual matching. Both are bidirectionally constrained on Cross-Entropy based and ranking-based Triplet Matching Losses. The region-word matching is calculated using the attention mechanism without direct supervision about their mapping. The pre-trained multi-modal representation learning paves the way for downstream tasks concerning image and/or text encoding. We demonstrate the representation learning quality by cross-modality retrievals and multi-label classifications on two datasets: OpenI-IU and MIMIC-CXR. Our code is available at https://github.com/mshaikh2/JoImTeR_MLMI_2021.
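
The abstract describes the matching objective only in words, so a small sketch may help make it concrete. The code below is not taken from the authors' repository; it is a minimal pure-Python illustration that assumes dot-product attention of each word over image regions, a mean-pooled cosine score per image-report pair, and a hardest-negative triplet term. The function names, the `temperature` and `margin` values, and the toy 2-D features are all hypothetical choices made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    denom = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / denom if denom > 0 else 0.0

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def local_score(regions, words, temperature=5.0):
    """Region-word matching score for one image-report pair (assumed form)."""
    word_scores = []
    for w in words:
        # Each word attends over all regions; no region-word labels are used.
        attn = softmax([temperature * dot(w, r) for r in regions])
        # Attention-weighted region context for this word.
        context = [sum(a * r[d] for a, r in zip(attn, regions))
                   for d in range(len(w))]
        word_scores.append(cosine(w, context))
    return sum(word_scores) / len(word_scores)

def bidirectional_cross_entropy(scores):
    """Cross-entropy over rows (image to text) and columns (text to image)."""
    n = len(scores)
    loss = 0.0
    for i in range(n):
        row = softmax(scores[i])
        col = softmax([scores[j][i] for j in range(n)])
        loss -= math.log(row[i]) + math.log(col[i])
    return loss / n

def bidirectional_triplet(scores, margin=0.5):
    """Ranking loss against the hardest negative in each direction (needs n >= 2)."""
    n = len(scores)
    loss = 0.0
    for i in range(n):
        hardest_text = max(scores[i][j] for j in range(n) if j != i)
        hardest_image = max(scores[j][i] for j in range(n) if j != i)
        loss += max(0.0, margin - scores[i][i] + hardest_text)
        loss += max(0.0, margin - scores[i][i] + hardest_image)
    return loss / n

# Toy batch: two images (two region vectors each) and two reports (two word vectors each).
images = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
reports = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]

# scores[i][j] is the local matching score between image i and report j.
scores = [[local_score(img, rep) for rep in reports] for img in images]
ce_loss = bidirectional_cross_entropy(scores)
triplet_loss = bidirectional_triplet(scores)
```

In this toy batch the matched pairs sit on the diagonal of `scores`, so both losses reward a diagonal that dominates its row and column. Under the same assumptions, the global image-sentence level could reuse the two loss functions with a score matrix built from one vector per image and one per report.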


Similar articles

1
Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment.
Mach Learn Med Imaging. 2021 Sep;12966:110-119. doi: 10.1007/978-3-030-87589-3_12. Epub 2021 Sep 21.
2
MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-Ray Self-Supervised Representation Learning.
IEEE J Biomed Health Inform. 2024 Dec;28(12):7480-7490. doi: 10.1109/JBHI.2024.3455337. Epub 2024 Dec 5.
3
Radiology report generation with a learned knowledge base and multi-modal alignment.
Med Image Anal. 2023 May;86:102798. doi: 10.1016/j.media.2023.102798. Epub 2023 Mar 23.
4
Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment.
Med Image Comput Comput Assist Interv. 2020 Oct;12262:529-539. doi: 10.1007/978-3-030-59713-9_51. Epub 2020 Sep 29.
5
Translating medical image to radiological report: Adaptive multilevel multi-attention approach.
Comput Methods Programs Biomed. 2022 Jun;221:106853. doi: 10.1016/j.cmpb.2022.106853. Epub 2022 May 4.
6
Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports.
ArXiv. 2023 Oct 10:arXiv:2306.08749v2.
7
A label information fused medical image report generation framework.
Artif Intell Med. 2024 Apr;150:102823. doi: 10.1016/j.artmed.2024.102823. Epub 2024 Feb 22.
8
Utilizing Longitudinal Chest X-Rays and Reports to Pre-fill Radiology Reports.
Med Image Comput Comput Assist Interv. 2023 Oct;14224:189-198. doi: 10.1007/978-3-031-43904-9_19. Epub 2023 Oct 1.
9
DualAttNet: Synergistic fusion of image-level and fine-grained disease attention for multi-label lesion detection in chest X-rays.
Comput Biol Med. 2024 Jan;168:107742. doi: 10.1016/j.compbiomed.2023.107742. Epub 2023 Nov 22.
10
CADxReport: Chest x-ray report generation using co-attention mechanism and reinforcement learning.
Comput Biol Med. 2022 Jun;145:105498. doi: 10.1016/j.compbiomed.2022.105498. Epub 2022 Apr 15.

Cited by

1
A Systematic Review and Implementation Guidelines of Multimodal Foundation Models in Medical Imaging.
Res Sq. 2025 Apr 28:rs.3.rs-5537908. doi: 10.21203/rs.3.rs-5537908/v1.
2
Multimodal contrastive learning for enhanced explainability in pediatric brain tumor molecular diagnosis.
Sci Rep. 2025 Mar 30;15(1):10943. doi: 10.1038/s41598-025-94806-4.
3
A survey of the impact of self-supervised pretraining for diagnostic tasks in medical X-ray, CT, MRI, and ultrasound.
BMC Med Imaging. 2024 Apr 6;24(1):79. doi: 10.1186/s12880-024-01253-0.
4
A scoping review on multimodal deep learning in biomedical images and texts.
J Biomed Inform. 2023 Oct;146:104482. doi: 10.1016/j.jbi.2023.104482. Epub 2023 Aug 29.
5
That's the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data.
Proc Conf Empir Methods Nat Lang Process. 2022 Dec;2022:3626-3648.
6
Self-supervised learning for medical image classification: a systematic review and implementation guidelines.
NPJ Digit Med. 2023 Apr 26;6(1):74. doi: 10.1038/s41746-023-00811-0.
7
Few-Shot Learning Geometric Ensemble for Multi-label Classification of Chest X-Rays.
Data Augment Label Imperfections (2022). 2022 Sep;13567:112-122. doi: 10.1007/978-3-031-17027-0_12. Epub 2022 Sep 16.

References

1
Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment.
Med Image Comput Comput Assist Interv. 2020 Oct;12262:529-539. doi: 10.1007/978-3-030-59713-9_51. Epub 2020 Sep 29.
2
MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports.
Sci Data. 2019 Dec 12;6(1):317. doi: 10.1038/s41597-019-0322-0.
3
Deep Visual-Semantic Alignments for Generating Image Descriptions.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):664-676. doi: 10.1109/TPAMI.2016.2598339. Epub 2016 Aug 5.
4
Preparing a collection of radiology examinations for distribution and retrieval.
J Am Med Inform Assoc. 2016 Mar;23(2):304-10. doi: 10.1093/jamia/ocv080. Epub 2015 Jul 1.