Zhang Jiajin, Chao Hanqing, Yan Pingkun
IEEE Trans Image Process. 2023;32:1272-1284. doi: 10.1109/TIP.2023.3242141. Epub 2023 Feb 28.
In the past several years, various adversarial training (AT) approaches have been invented to robustify deep learning models against adversarial attacks. However, mainstream AT methods assume that the training and testing data are drawn from the same distribution and that the training data are annotated. When these two assumptions are violated, existing AT methods fail because either they cannot pass knowledge learnt from a source domain to an unlabeled target domain or they are confused by the adversarial samples in that unlabeled space. In this paper, we first point out this new and challenging problem: adversarial training in an unlabeled target domain. We then propose a novel framework named Unsupervised Cross-domain Adversarial Training (UCAT) to address this problem. UCAT effectively leverages the knowledge of the labeled source domain to prevent the adversarial samples from misleading the training process, under the guidance of automatically selected high-quality pseudo labels of the unannotated target domain data together with the discriminative and robust anchor representations of the source domain data. Experiments on four public benchmarks show that models trained with UCAT can achieve both high accuracy and strong robustness. The effectiveness of the proposed components is demonstrated through a large set of ablation studies. The source code is publicly available at https://github.com/DIAL-RPI/UCAT.