Pahde Frederik, Wiegand Thomas, Lapuschkin Sebastian, Samek Wojciech
Fraunhofer Heinrich Hertz Institute, 10587 Berlin, Germany.
Technische Universität Berlin, 10587 Berlin, Germany.
Mach Learn. 2025;114(9):206. doi: 10.1007/s10994-025-06834-w. Epub 2025 Aug 12.
Deep neural networks are increasingly employed in high-stakes medical applications, despite their tendency toward shortcut learning in the presence of spurious correlations, which can have potentially fatal consequences in practice. Whereas a multitude of works address either the detection or the mitigation of such shortcut behavior in isolation, the Reveal2Revise approach provides a comprehensive bias mitigation framework combining these steps. However, effectively addressing these biases often requires substantial labeling effort from domain experts. In this work, we review the steps of the Reveal2Revise framework and enhance it with semi-automated, interpretability-based bias annotation capabilities. This includes methods for sample- and feature-level bias annotation, providing valuable information for bias mitigation methods to unlearn the undesired shortcut behavior. We demonstrate the applicability of the framework on four medical datasets across two modalities, featuring both controlled and real-world spurious correlations caused by data artifacts. We successfully identify and mitigate these biases in VGG16, ResNet50, and contemporary Vision Transformer models, ultimately increasing their robustness and applicability for real-world medical tasks. Our code is available at https://github.com/frederikpahde/medical-ai-safety.