Yao Qingsong, He Zecheng, Li Yuexiang, Lin Yi, Ma Kai, Zheng Yefeng, Kevin Zhou S
IEEE Trans Med Imaging. 2023 Nov 23;PP. doi: 10.1109/TMI.2023.3335098.
Deep learning-based methods for medical images can be easily compromised by adversarial examples (AEs), posing a serious security risk in clinical decision-making. It has been discovered that AEs crafted by conventional adversarial attacks such as PGD, which optimize the classification logits, are easy to distinguish in the feature space, which enables accurate reactive defenses. To better understand this phenomenon and reassess the reliability of reactive defenses for medical AEs, we thoroughly investigate the characteristics of conventional medical AEs. Specifically, we first theoretically prove that conventional adversarial attacks change the outputs by continuously optimizing vulnerable features in a fixed direction, thereby leading to outlier representations in the feature space. Then, we conduct a stress test that reveals the vulnerability of medical images by comparing them with natural images. Interestingly, this vulnerability is a double-edged sword that can be exploited to hide AEs. We then propose a simple yet effective hierarchical feature constraint (HFC), a novel add-on to conventional white-box attacks, which helps hide the adversarial features within the target feature distribution. The proposed method is evaluated on three medical datasets, both 2D and 3D, with different modalities. The experimental results demonstrate the superiority of HFC: it bypasses an array of state-of-the-art medical AE detectors more efficiently than competing adaptive attacks, which reveals the deficiencies of reactive defenses for medical images and can inform the development of more robust defenses in the future.
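The idea of adding a feature-space term to a conventional white-box attack can be illustrated with a toy sketch. The code below is not the authors' HFC implementation: it assumes a hypothetical two-layer network (`W1`, `W2`), constrains a single hidden layer rather than a hierarchy of layers, and models the target feature distribution as one Gaussian with identity covariance, so the penalty reduces to the squared distance to an assumed target-class feature mean `mu`. All function names and the weight `lam` are illustrative.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable log-softmax of a 1-D logit vector."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())


def attack_loss(x, W1, W2, target, mu, lam):
    """Targeted cross-entropy plus a feature-space penalty (toy stand-in).

    The penalty 0.5 * ||h - mu||^2 is an assumed, simplified surrogate for
    "keep the hidden feature close to the target feature distribution".
    """
    h = np.tanh(W1 @ x)                       # hidden-layer feature
    ce = -log_softmax(W2 @ h)[target]         # pushes the prediction to `target`
    return ce + lam * 0.5 * np.sum((h - mu) ** 2)


def attack_gradient(x, W1, W2, target, mu, lam):
    """Analytic gradient of attack_loss with respect to the input x."""
    h = np.tanh(W1 @ x)
    p = np.exp(log_softmax(W2 @ h))
    onehot = np.zeros_like(p)
    onehot[target] = 1.0
    dh = W2.T @ (p - onehot) + lam * (h - mu)  # classification + feature terms
    return W1.T @ (dh * (1.0 - h ** 2))        # back through tanh and W1


def pgd_with_feature_penalty(x0, W1, W2, target, mu, lam,
                             eps=0.1, step=0.01, iters=100):
    """PGD-style signed-gradient descent inside an L-infinity ball around x0."""
    x = x0.copy()
    for _ in range(iters):
        g = attack_gradient(x, W1, W2, target, mu, lam)
        x = x - step * np.sign(g)              # descend the combined loss
        x = np.clip(x, x0 - eps, x0 + eps)     # project back into the ball
    return x
```

Setting `lam=0` recovers a plain targeted PGD-style attack on the logits, while `lam>0` trades some of the classification objective for staying near `mu` in feature space. A real attack on images would also clip to the valid pixel range, which this sketch omits.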