Rashidisabet Homa, Chan R V Paul, Leiderman Yannek I, Vajaranant Thasarat Sutabutr, Yi Darvin
Department of Biomedical Engineering, University of Illinois Chicago, Chicago, IL, USA.
Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA.
Transl Vis Sci Technol. 2025 Jun 2;14(6):3. doi: 10.1167/tvst.14.6.3.
PURPOSE: Standard deep learning (DL) models often suffer significant performance degradation on out-of-distribution (OOD) data, where test data differs from training data, a common challenge in medical imaging due to real-world variation. METHODS: We propose a unified self-censorship framework as an alternative to standard DL models for glaucoma classification, using deep evidential uncertainty quantification. Our approach detects OOD samples at both the dataset and image levels. Dataset-level self-censorship enables users to accept or reject predictions for an entire new dataset based on model uncertainty, whereas image-level self-censorship refrains from making predictions on individual OOD images rather than risking incorrect classifications. We validated our approach across diverse datasets. RESULTS: Our dataset-level self-censorship method outperforms the standard DL model in OOD detection, achieving an average 11.93% higher area under the curve (AUC) across 14 OOD datasets. Similarly, our image-level self-censorship model improves glaucoma classification accuracy by an average of 17.22% over baselines across 4 external glaucoma datasets, while censoring 28.25% more data. CONCLUSIONS: Our approach addresses the challenge of generalization in standard DL models for glaucoma classification across diverse datasets by selectively withholding predictions when the model is uncertain. This method reduces misclassification errors compared with state-of-the-art baselines, particularly on OOD cases. TRANSLATIONAL RELEVANCE: This study introduces a tunable framework that explores the trade-off between prediction accuracy and data retention in glaucoma prediction. By managing uncertainty in model outputs, the approach lays a foundation for future decision support tools aimed at improving the reliability of automated glaucoma diagnosis.
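The image-level self-censorship described above can be illustrated with a small sketch. In evidential deep learning (in the style of Sensoy et al.), the network outputs non-negative per-class evidence, which parameterizes a Dirichlet distribution; total evidence yields a closed-form uncertainty score, and predictions are withheld when that score exceeds a threshold. This is a minimal, hypothetical illustration of the general technique, not the authors' implementation; the threshold `tau` and evidence values are made up for the example.

```python
import numpy as np

def evidential_uncertainty(evidence):
    """Dirichlet-based uncertainty from non-negative per-class evidence.

    alpha = evidence + 1 are the Dirichlet parameters; with K classes and
    Dirichlet strength S = sum(alpha), uncertainty is u = K / S in (0, 1].
    """
    evidence = np.asarray(evidence, dtype=float)
    alpha = evidence + 1.0           # Dirichlet parameters
    strength = alpha.sum()           # total Dirichlet strength S
    k = evidence.size                # number of classes K
    probs = alpha / strength         # expected class probabilities
    uncertainty = k / strength       # high when evidence is scarce (OOD-like)
    return probs, uncertainty

def self_censor(evidence, tau=0.5):
    """Image-level self-censorship: predict only when uncertainty <= tau.

    Returns (predicted_class, uncertainty), or (None, uncertainty) when the
    model abstains on a likely out-of-distribution image.
    """
    probs, u = evidential_uncertainty(evidence)
    if u > tau:
        return None, u               # abstain rather than risk an error
    return int(np.argmax(probs)), u

# Confident sample: strong evidence for class 1 (e.g. "glaucoma").
label, u_in = self_censor([0.2, 9.0])
# Ambiguous / OOD-like sample: almost no evidence for either class.
abstained, u_ood = self_censor([0.1, 0.2])
```

Tuning `tau` trades prediction accuracy against data retention, mirroring the tunable accuracy/retention trade-off the abstract describes: a stricter (lower) threshold censors more images but makes fewer mistakes on the images it does classify.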