Hua Guang, Teoh Andrew Beng Jin, Xiang Yong, Jiang Hao
IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):11204-11217. doi: 10.1109/TNNLS.2023.3250210. Epub 2024 Aug 5.
The unprecedented success of deep learning could not have been achieved without the synergy of big data, computing power, and human knowledge, none of which is free. This calls for copyright protection of deep neural networks (DNNs), which has been tackled via DNN watermarking. Owing to the special structure of DNNs, backdoor watermarks have been one of the popular solutions. In this article, we first present a big picture of DNN watermarking scenarios, with rigorous definitions unifying the black-box and white-box concepts across the watermark embedding, attack, and verification phases. Then, from the perspective of data diversity, especially the adversarial and open-set examples overlooked in existing works, we rigorously reveal the vulnerability of backdoor watermarks to black-box ambiguity attacks. To solve this problem, we propose an unambiguous backdoor watermarking scheme based on deterministically dependent trigger samples and labels, and show that the cost of an ambiguity attack increases from the existing linear complexity to exponential complexity. Furthermore, noting that the existing definition of backdoor fidelity is concerned solely with classification accuracy, we propose to evaluate fidelity more rigorously by examining the training data feature distributions and decision boundaries before and after backdoor embedding. Incorporating the proposed prototype-guided regularizer (PGR) and fine-tune-all-layers (FTAL) strategy, we show that backdoor fidelity can be substantially improved. Experimental results using two versions of the basic ResNet18, the advanced wide residual network (WRN28_10), and EfficientNet-B0, on the MNIST, CIFAR-10, CIFAR-100, and FOOD-101 classification tasks, respectively, illustrate the advantages of the proposed method.
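The abstract's core idea is that trigger labels should be deterministically derived from the trigger samples themselves, so an adversary cannot freely assign labels to forged triggers. The paper's exact construction is not given in the abstract; the sketch below illustrates the principle with a hypothetical keyed-hash mapping (`trigger_label`, `secret_key`, and the SHA-256 choice are all assumptions for illustration, not the authors' method).

```python
import hashlib

def trigger_label(sample_bytes: bytes, num_classes: int, secret_key: bytes) -> int:
    """Deterministically map a trigger sample to its watermark label.

    Because the label is a keyed hash of the sample, the owner can
    regenerate it at verification time, while a forger who fabricates
    triggers cannot choose labels independently of the samples.
    """
    digest = hashlib.sha256(secret_key + sample_bytes).digest()
    # Fold the first 4 digest bytes into a class index.
    return int.from_bytes(digest[:4], "big") % num_classes

# With n such triggers, guessing all sample-label pairs correctly
# succeeds with probability about (1/num_classes)**n, i.e., the
# ambiguity-attack cost grows exponentially in n rather than linearly.
```

The key design point is determinism: verification recomputes each label from the sample and the secret key, so any sample-label pair that fails the recomputation is rejected as a forgery.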