Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: A data-driven approach for improved classification.

Affiliation

Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182, USA.

Publication information

Med Image Anal. 2025 Jan;99:103383. doi: 10.1016/j.media.2024.103383. Epub 2024 Nov 10.

DOI: 10.1016/j.media.2024.103383
PMID: 39546982
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11609015/
Abstract

In chest X-ray (CXR) image analysis, rule-based systems are usually employed to extract labels from reports for dataset releases. However, there is still room for improvement in label quality. These labelers typically output only presence labels, sometimes with binary uncertainty indicators, which limits their usefulness. Supervised deep learning models have also been developed for report labeling but lack adaptability, similar to rule-based systems. In this work, we present MAPLEZ (Medical report Annotations with Privacy-preserving Large language model using Expeditious Zero shot answers), a novel approach leveraging a locally executable Large Language Model (LLM) to extract and enhance findings labels on CXR reports. MAPLEZ extracts not only binary labels indicating the presence or absence of a finding but also the location, severity, and radiologists' uncertainty about the finding. Over eight abnormalities from five test sets, we show that our method can extract these annotations with an increase of 3.6 percentage points (pp) in macro F1 score for categorical presence annotations and more than 20 pp increase in F1 score for the location annotations over competing labelers. Additionally, using the combination of improved annotations and multi-type annotations in classification supervision in a dataset of limited-resolution CXRs, we demonstrate substantial advancements in proof-of-concept classification quality, with an increase of 1.1 pp in AUROC over models trained with annotations from the best alternative approach. We share code and annotations.
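
The abstract reports labeler gains as percentage-point (pp) differences in macro F1, that is, the unweighted mean of per-finding F1 scores. The sketch below illustrates that arithmetic only; it is not the authors' evaluation code. The finding names and toy labels are hypothetical placeholders, and the helper names `f1_score` and `macro_f1` are assumptions made for illustration.

```python
from typing import Dict, List


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Binary F1 for one finding; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(truth: Dict[str, List[int]], pred: Dict[str, List[int]]) -> float:
    """Unweighted mean of per-finding F1 scores."""
    scores = [f1_score(truth[name], pred[name]) for name in truth]
    return sum(scores) / len(scores)


# Hypothetical toy data: two findings, four reports each (1 = present, 0 = absent).
truth = {"finding_a": [1, 0, 1, 1], "finding_b": [0, 0, 1, 0]}
labeler_1 = {"finding_a": [1, 0, 0, 1], "finding_b": [0, 1, 1, 0]}
labeler_2 = {"finding_a": [1, 0, 1, 1], "finding_b": [0, 0, 1, 0]}

# Difference between two labelers, expressed in percentage points.
gain_pp = 100 * (macro_f1(truth, labeler_2) - macro_f1(truth, labeler_1))
print(f"macro F1 gain: {gain_pp:.1f} pp")
```

Because macro averaging weights every finding equally, a rare abnormality influences the score as much as a common one.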

Similar articles

1. Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: A data-driven approach for improved classification. Med Image Anal. 2025 Jan;99:103383. doi: 10.1016/j.media.2024.103383. Epub 2024 Nov 10.
2. Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: a data-driven approach for improved classification. ArXiv. 2024 Aug 15:arXiv:2403.04024v2.
3. Language model-based labeling of German thoracic radiology reports. Rofo. 2025 Jan;197(1):55-64. doi: 10.1055/a-2287-5054. Epub 2024 Apr 25.
4. Multi-Label Chest X-Ray Image Classification With Single Positive Labels. IEEE Trans Med Imaging. 2024 Dec;43(12):4404-4418. doi: 10.1109/TMI.2024.3421644. Epub 2024 Dec 2.
5. German CheXpert Chest X-ray Radiology Report Labeler. Rofo. 2024 Sep;196(9):956-965. doi: 10.1055/a-2234-8268. Epub 2024 Jan 31.
6. BarlowTwins-CXR: enhancing chest X-ray abnormality localization in heterogeneous data with cross-domain self-supervised learning. BMC Med Inform Decis Mak. 2024 May 16;24(1):126. doi: 10.1186/s12911-024-02529-9.
7. Deep Omni-Supervised Learning for Rib Fracture Detection From Chest Radiology Images. IEEE Trans Med Imaging. 2024 May;43(5):1972-1982. doi: 10.1109/TMI.2024.3353248. Epub 2024 May 2.
8. Comparison of radiologist versus natural language processing-based image annotations for deep learning system for tuberculosis screening on chest radiographs. Clin Imaging. 2022 Jul;87:34-37. doi: 10.1016/j.clinimag.2022.04.009. Epub 2022 Apr 25.
9. Better performance of deep learning pulmonary nodule detection using chest radiography with pixel level labels in reference to computed tomography: data quality matters. Sci Rep. 2024 Jul 10;14(1):15967. doi: 10.1038/s41598-024-66530-y.
10. Automated Radiology Report Labeling in Chest X-Ray Pathologies: Development and Evaluation of a Large Language Model Framework. JMIR Med Inform. 2025 Mar 28;13:e68618. doi: 10.2196/68618.

Cited by

1. From large language models to multimodal AI: a scoping review on the potential of generative AI in medicine. Biomed Eng Lett. 2025 Aug 22;15(5):845-863. doi: 10.1007/s13534-025-00497-1. eCollection 2025 Sep.
2. Bootstrapping BI-RADS classification using large language models and transformers in breast magnetic resonance imaging reports. Vis Comput Ind Biomed Art. 2025 Apr 3;8(1):8. doi: 10.1186/s42492-025-00189-8.
3. AXpert: human expert facilitated privacy-preserving large language models for abdominal X-ray report labeling. JAMIA Open. 2025 Feb 10;8(1):ooaf008. doi: 10.1093/jamiaopen/ooaf008. eCollection 2025 Feb.
