
Out-of-distribution detection with in-distribution voting using the medical example of chest x-ray classification.

Affiliations

Munich Institute of Biomedical Engineering and the School of Computation, Information, and Technology, Technical University of Munich, Munich, Germany.

Institute for History and Ethics in Medicine and Munich School of Technology in Society, Technical University of Munich, Munich, Germany.

Publication information

Med Phys. 2024 Apr;51(4):2721-2732. doi: 10.1002/mp.16790. Epub 2023 Oct 13.

DOI:10.1002/mp.16790
Abstract

BACKGROUND

Deep learning models are being applied to more and more use cases with astonishing success stories, but how do they perform in the real world? Models are typically tested on specific, cleaned data sets, but once deployed in the real world they will encounter unexpected, out-of-distribution (OOD) data.

PURPOSE

To investigate the impact of OOD radiographs on existing chest x-ray classification models and to increase their robustness against OOD data.

METHODS

The study employed the commonly used chest x-ray classification model CheXNet, trained on the ChestX-ray14 data set, and tested its robustness against OOD data using three public radiography data sets (IRMA, Bone Age, and MURA) and the ImageNet data set. To detect OOD data in multi-label classification, we proposed in-distribution voting (IDV). OOD detection performance was measured across data sets using the area under the receiver operating characteristic curve (AUC) and compared with Mahalanobis-based OOD detection, MaxLogit, MaxEnergy, self-supervised OOD detection (SS OOD), and CutMix.
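
As a rough illustration of how score-based baselines such as MaxLogit and MaxEnergy can be evaluated with an OOD AUC, the following is a minimal sketch and not the authors' code. It assumes MaxEnergy means the largest label-wise energy log(1 + exp(logit)), a formulation commonly used for multi-label classifiers; the function names and the example logits are hypothetical.

```python
import math


def max_logit_score(logits):
    """MaxLogit: the largest per-label logit; higher suggests in-distribution."""
    return max(logits)


def max_energy_score(logits):
    """Largest label-wise energy, softplus(z) = log(1 + exp(z)), computed stably.

    Assumption: this is the multi-label reading of "MaxEnergy".
    """
    return max(max(z, 0.0) + math.log1p(math.exp(-abs(z))) for z in logits)


def ood_auc(id_scores, ood_scores):
    """AUC as the probability that an ID sample outscores an OOD sample.

    Ties count as half a win, so a detector that cannot separate the
    two groups lands at 0.5.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical logits for three labels; the values are made up for illustration.
id_logits = [[2.1, -0.5, 0.3], [1.4, 0.9, -1.2]]
ood_logits = [[-1.0, -2.0, -0.7], [0.2, -0.4, -3.0]]

id_scores = [max_logit_score(z) for z in id_logits]
ood_scores = [max_logit_score(z) for z in ood_logits]
print(ood_auc(id_scores, ood_scores))
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch; a rank-based implementation would be the usual choice for full data sets.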

RESULTS

Without additional OOD detection, the chest x-ray classifier failed to discard any OOD images, with an AUC of 0.5. The proposed IDV approach trained on ID (ChestX-ray14) and OOD data (IRMA and ImageNet) achieved, on average, an OOD AUC of 0.999 across the three data sets, surpassing all other OOD detection methods. Mahalanobis-based OOD detection achieved an average OOD detection AUC of 0.982. IDV trained solely with a few thousand ImageNet images achieved an AUC of 0.913, which was considerably higher than MaxLogit (0.726), MaxEnergy (0.724), SS OOD (0.476), and CutMix (0.376).
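
The abstract does not spell out the voting rule behind IDV, so the following sketch rests on stated assumptions and should not be read as the authors' implementation. It assumes that OOD training images receive an all-zero multi-label target, and that at inference each label "votes" for in-distribution when its predicted probability reaches a per-label threshold. The function names, thresholds, and `min_votes` parameter are hypothetical.

```python
def idv_training_targets(labels, is_ood):
    """Assumed label scheme: OOD training images get an all-zero target."""
    return [0] * len(labels) if is_ood else list(labels)


def count_id_votes(probabilities, thresholds):
    """Each label votes 'ID' when its probability reaches its threshold."""
    return sum(1 for p, t in zip(probabilities, thresholds) if p >= t)


def is_in_distribution(probabilities, thresholds, min_votes=1):
    """Accept a sample as ID when enough labels vote for it (assumed rule)."""
    return count_id_votes(probabilities, thresholds) >= min_votes


# Hypothetical per-label probabilities for a 3-label classifier.
thresholds = [0.5, 0.5, 0.5]
chest_xray = [0.82, 0.10, 0.05]   # one confident finding
knee_xray = [0.03, 0.02, 0.04]    # no label fires

print(is_in_distribution(chest_xray, thresholds))
print(is_in_distribution(knee_xray, thresholds))
```

Because the decision reuses the classifier's own outputs, a rule of this kind adds no extra forward pass, which is consistent with the abstract's remark that IDV has no additional inference overhead.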

CONCLUSIONS

With the exception of Mahalanobis-based OOD detection and the proposed IDV method, the performance of the tested OOD detection methods did not translate well to radiography data sets. Consequently, training solely on ID data led to incorrect classification of OOD images as ID, resulting in increased false positive rates. IDV substantially improved the model's ID classification performance, even when trained with data that will not occur in the intended use case or test set (ImageNet), without additional inference overhead or a performance decrease in the target classification. The corresponding code is available at https://gitlab.lrz.de/IP/a-knee-cannot-have-lung-disease.
