The Hidden Threat of Hallucinations in Binary Chest X-ray Pneumonia Classification.

Authors

Rajaraman Sivaramakrishnan, Liang Zhaohui, Marini Niccolo, Xue Zhiyun, Antani Sameer

Affiliation

Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Publication information

Proc IEEE Int Symp Comput Based Med Syst. 2025 Jun;2025:668-673. doi: 10.1109/cbms65348.2025.00138. Epub 2025 Jul 4.

Abstract

Hallucination in deep learning (DL) classification, where DL models yield confidently erroneous predictions, remains a pressing concern. This study investigates whether binary classifiers truly learn disease-specific features when distinguishing overlapping radiological presentations among pneumonia subtypes on chest X-ray (CXR) images. Specifically, we evaluate whether an uncertainty measure is a valuable tool for classifying signs of different pathogen-specific subtypes of pneumonia. We evaluated two binary classifiers that separate bacterial pneumonia and viral pneumonia, respectively, from normal CXRs. A third classifier explored the ability to distinguish bacterial from viral pneumonia presentations, to highlight our concern regarding the hallucinations observed in the former cases. Our analysis computes the Matthews Correlation Coefficient and prediction entropy on a pediatric CXR dataset and reveals that the normal/bacterial and normal/viral classifiers consistently and confidently misclassify the unseen pneumonia subtype as their respective disease class. These findings expose a critical limitation: binary classifiers tend to hallucinate by relying on general pneumonia indicators rather than pathogen-specific patterns, which challenges their utility in clinical workflows.
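
The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels in {0, 1} and a single predicted disease probability per image. The abstract does not state the exact entropy formulation the authors used, so the Shannon entropy in bits shown here is an assumption, and the function names and the sample probabilities are hypothetical, not taken from the paper.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels in {0, 1}.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def binary_entropy(p):
    """Shannon entropy in bits of a binary prediction with P(disease) = p.

    Low entropy means the model is confident; 1.0 bit is maximal uncertainty.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# Hypothetical disease probabilities from a normal-vs-bacterial classifier
# applied to viral pneumonia images, a subtype it never saw in training.
viral_probs = [0.97, 0.99, 0.95, 0.98]
entropies = [binary_entropy(p) for p in viral_probs]
mean_entropy = sum(entropies) / len(entropies)
print(f"mean prediction entropy: {mean_entropy:.3f} bits")
```

In this illustration, a low mean entropy on the unseen subtype corresponds to the behaviour the abstract describes: the classifier assigns the images to its disease class with high confidence, even though that class label is wrong for them.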
