IEEE Trans Cybern. 2020 Nov;50(11):4835-4847. doi: 10.1109/TCYB.2019.2914099. Epub 2019 May 17.
Salient object detection has recently improved remarkably thanks to deep convolutional neural networks, which extract powerful image features. In particular, state-of-the-art salient object detection methods gain high accuracy and efficiency from fully convolutional network (FCN)-based frameworks, which are trained end to end and predict pixel-wise labels. However, such frameworks are vulnerable to adversarial attacks, which confuse neural networks by adding quasi-imperceptible noise to input images without changing the ground truth annotated by human subjects. To our knowledge, this paper is the first to mount successful adversarial attacks on salient object detection models and to verify that adversarial samples are effective against a wide range of existing methods. Furthermore, this paper proposes a novel end-to-end trainable framework that enhances the robustness of arbitrary FCN-based salient object detection models against adversarial attacks. The proposed framework first introduces new generic noise to destroy adversarial perturbations, and then learns to predict saliency maps for input images that contain the introduced noise. Specifically, the proposed method consists of a segment-wise shielding component, which preserves boundaries and destroys delicate adversarial noise patterns, and a context-aware restoration component, which refines saliency maps through global contrast modeling. The experimental results suggest that the proposed framework significantly improves the performance of state-of-the-art models on a series of datasets.
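To make the shielding idea concrete, the following is a minimal sketch of how noise confined to small regions can destroy fine pixel-level patterns while leaving coarse structure intact. It rests on assumptions that are not in the abstract: fixed square blocks stand in for the boundary-preserving segments, the "generic noise" is modeled as a random shuffle of pixel positions inside each block, and the helper name `shield_blocks` and its parameters are hypothetical. It is an illustration of the general principle, not the authors' implementation.

```python
import numpy as np


def shield_blocks(image, block=4, seed=0):
    """Shuffle pixel positions inside each block x block cell.

    Hypothetical helper: square cells are an assumed stand-in for the
    segments mentioned in the abstract, and shuffling is an assumed
    form of the introduced noise.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    out = image.copy()
    for y in range(0, h, block):
        for x in range(0, w, block):
            cell = out[y:y + block, x:x + block]
            ch, cw = cell.shape[:2]  # edge cells may be smaller than block
            # Flatten the cell to a list of pixels (one row per pixel).
            flat = cell.reshape(ch * cw, -1)
            # Reorder the pixels at random; values are moved, never altered.
            perm = rng.permutation(ch * cw)
            out[y:y + block, x:x + block] = flat[perm].reshape(cell.shape)
    return out


# Usage: a synthetic 8x8 RGB image, shielded with 4x4 cells.
image = np.arange(8 * 8 * 3, dtype=np.float64).reshape(8, 8, 3)
shielded = shield_blocks(image, block=4)
```

Because pixels only move within their own cell, each cell keeps its mean color and no pixel crosses a cell border, whereas any perturbation that depends on exact pixel positions inside the cell is scrambled. A downstream network would then have to be trained on such shuffled inputs, in line with the abstract's statement that the framework learns to predict saliency maps for images containing the introduced noise.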