Improving chest X-ray report generation by leveraging warm starting.

Affiliation

The Australian e-Health Research Centre, CSIRO Health and Biosecurity, Brisbane, Australia.

Publication information

Artif Intell Med. 2023 Oct;144:102633. doi: 10.1016/j.artmed.2023.102633. Epub 2023 Aug 19.

DOI: 10.1016/j.artmed.2023.102633
PMID: 37783533

Abstract

Automatically generating a report from a patient's Chest X-rays (CXRs) is a promising solution to reducing clinical workload and improving patient care. However, current CXR report generators, which are predominantly encoder-to-decoder models, lack the diagnostic accuracy to be deployed in a clinical setting. To improve CXR report generation, we investigate warm starting the encoder and decoder with recent open-source computer vision and natural language processing checkpoints, such as the Vision Transformer (ViT) and PubMedBERT. To this end, each checkpoint is evaluated on the MIMIC-CXR and IU X-ray datasets. Our experimental investigation demonstrates that the Convolutional vision Transformer (CvT) ImageNet-21K and the Distilled Generative Pre-trained Transformer 2 (DistilGPT2) checkpoints are best for warm starting the encoder and decoder, respectively. Compared to the state-of-the-art (M Transformer Progressive), CvT2DistilGPT2 attained an improvement of 8.3% for CE F-1, 1.8% for BLEU-4, 1.6% for ROUGE-L, and 1.0% for METEOR. The reports generated by CvT2DistilGPT2 have a higher similarity to radiologist reports than previous approaches. This indicates that leveraging warm starting improves CXR report generation. Code and checkpoints for CvT2DistilGPT2 are available at https://github.com/aehrc/cvt2distilgpt2.
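
As a rough illustration of what warm starting means here, the sketch below initialises a toy encoder-to-decoder parameter set from two stand-in checkpoints and reports which parameters had nothing to load. It is not the authors' implementation (that is in the repository linked above). The function `warm_start`, the parameter names, and the weight values are all hypothetical, and the assumption that the decoder's cross-attention weights are absent from a language-model checkpoint is an illustration of a common situation, not a detail taken from the abstract.

```python
import random


def warm_start(model_params, checkpoint):
    """Copy checkpoint values into model_params where name and length match.

    Returns the names of parameters that keep their fresh (random)
    initialisation because the checkpoint had no usable entry for them.
    """
    not_loaded = []
    for name, values in model_params.items():
        pretrained = checkpoint.get(name)
        if pretrained is not None and len(pretrained) == len(values):
            model_params[name] = list(pretrained)  # reuse pretrained weights
        else:
            not_loaded.append(name)  # stays randomly initialised
    return not_loaded


def random_init(size, rng):
    """Small Gaussian initialisation, standing in for a cold start."""
    return [rng.gauss(0.0, 0.02) for _ in range(size)]


rng = random.Random(0)

# Hypothetical parameter names; a real model has many more.
model = {
    "encoder.patch_embed": random_init(4, rng),
    "decoder.self_attn": random_init(4, rng),
    "decoder.cross_attn": random_init(4, rng),
}

# Stand-ins for a pretrained vision checkpoint and a pretrained
# language-model checkpoint (values are made up).
vision_checkpoint = {"encoder.patch_embed": [0.1, 0.2, 0.3, 0.4]}
language_checkpoint = {"decoder.self_attn": [0.5, 0.6, 0.7, 0.8]}

missing = warm_start(model, {**vision_checkpoint, **language_checkpoint})
print(missing)  # parameters that still need to be learned from scratch
```

In this toy setup only the cross-attention entry is left at its random initialisation, so fine-tuning on the report-generation data would have to learn it from scratch while the remaining parameters start from pretrained values.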

Similar articles

1. Improving chest X-ray report generation by leveraging warm starting.
   Artif Intell Med. 2023 Oct;144:102633. doi: 10.1016/j.artmed.2023.102633. Epub 2023 Aug 19.
2. Multi-modal transformer architecture for medical image analysis and automated report generation.
   Sci Rep. 2024 Aug 20;14(1):19281. doi: 10.1038/s41598-024-69981-5.
3. Translating medical image to radiological report: Adaptive multilevel multi-attention approach.
   Comput Methods Programs Biomed. 2022 Jun;221:106853. doi: 10.1016/j.cmpb.2022.106853. Epub 2022 May 4.
4. Utilizing Longitudinal Chest X-Rays and Reports to Pre-fill Radiology Reports.
   Med Image Comput Comput Assist Interv. 2023 Oct;14224:189-198. doi: 10.1007/978-3-031-43904-9_19. Epub 2023 Oct 1.
5. CADxReport: Chest x-ray report generation using co-attention mechanism and reinforcement learning.
   Comput Biol Med. 2022 Jun;145:105498. doi: 10.1016/j.compbiomed.2022.105498. Epub 2022 Apr 15.
6. MuSiC-ViT: A multi-task Siamese convolutional vision transformer for differentiating change from no-change in follow-up chest radiographs.
   Med Image Anal. 2023 Oct;89:102894. doi: 10.1016/j.media.2023.102894. Epub 2023 Jul 12.
7. Cross Encoder-Decoder Transformer with Global-Local Visual Extractor for Medical Image Captioning.
   Sensors (Basel). 2022 Feb 13;22(4):1429. doi: 10.3390/s22041429.
8. RadioBERT: A deep learning-based system for medical report generation from chest X-ray images using contextual embeddings.
   J Biomed Inform. 2022 Nov;135:104220. doi: 10.1016/j.jbi.2022.104220. Epub 2022 Oct 10.
9. Contrastive pre-training and linear interaction attention-based transformer for universal medical reports generation.
   J Biomed Inform. 2023 Feb;138:104281. doi: 10.1016/j.jbi.2023.104281. Epub 2023 Jan 10.
10. Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports.
    ArXiv. 2023 Oct 10:arXiv:2306.08749v2.

Cited by

1. IHRAS: Automated Medical Report Generation from Chest X-Rays via Classification, Segmentation, and LLMs.
   Bioengineering (Basel). 2025 Jul 24;12(8):795. doi: 10.3390/bioengineering12080795.
2. Turkish Chest X-Ray Report Generation Model Using the Swin Enhanced Yield Transformer (Model-SEY) Framework.
   Diagnostics (Basel). 2025 Jul 17;15(14):1805. doi: 10.3390/diagnostics15141805.
3. Clinical applications of large language models in medicine and surgery: A scoping review.
   J Int Med Res. 2025 Jul;53(7):3000605251347556. doi: 10.1177/03000605251347556. Epub 2025 Jul 4.
4. Using AI to Translate and Simplify Spanish Orthopedic Medical Text: Instrument Validation Study.
   JMIR AI. 2025 Mar 21;4:e70222. doi: 10.2196/70222.
5. Efficiency and Quality of Generative AI-Assisted Radiograph Reporting.
   JAMA Netw Open. 2025 Jun 2;8(6):e2513921. doi: 10.1001/jamanetworkopen.2025.13921.
6. Designing a computer-assisted diagnosis system for cardiomegaly detection and radiology report generation.
   PLOS Digit Health. 2025 May 20;4(5):e0000835. doi: 10.1371/journal.pdig.0000835. eCollection 2025 May.
7. Multimodal generative AI for medical image interpretation.
   Nature. 2025 Mar;639(8056):888-896. doi: 10.1038/s41586-025-08675-y. Epub 2025 Mar 26.
8. Towards a holistic framework for multimodal LLM in 3D brain CT radiology report generation.
   Nat Commun. 2025 Mar 6;16(1):2258. doi: 10.1038/s41467-025-57426-0.
9. Collaboration between clinicians and vision-language models in radiology report generation.
   Nat Med. 2025 Feb;31(2):599-608. doi: 10.1038/s41591-024-03302-1. Epub 2024 Nov 7.
10. Large Language Models in Biomedical and Health Informatics: A Review with Bibliometric Analysis.
    J Healthc Inform Res. 2024 Sep 14;8(4):658-711. doi: 10.1007/s41666-024-00171-8. eCollection 2024 Dec.