Department of Computer Science, University of California, Irvine, United States of America.
Department of Computer Science and Engineering, University of South Carolina, United States of America.
Neural Netw. 2021 Aug;140:1-12. doi: 10.1016/j.neunet.2021.02.023. Epub 2021 Mar 4.
We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks and their robustness to adversarial attacks. SPLASH units have a simple parameterization yet retain the ability to approximate a wide range of non-linear functions. SPLASH units (1) are continuous; (2) are grounded (f(0)=0); (3) use symmetric hinges; and (4) place their hinges at fixed locations derived from the data (i.e., no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance across three datasets (MNIST, CIFAR-10, and CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and Network-in-Network). Furthermore, we show that SPLASH units significantly increase the robustness of deep neural networks to adversarial attacks. Our experiments on both black-box and white-box adversarial attacks show that commonly used architectures, namely LeNet5, All-CNN, Network-in-Network, and ResNet-20, can be up to 31% more robust to adversarial attacks simply by using SPLASH units instead of ReLUs. Finally, we show the benefits of using SPLASH activation functions in larger architectures designed for non-trivial datasets such as ImageNet.
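The properties listed in the abstract suggest a piecewise-linear form with fixed, symmetric hinge locations and learnable slopes. The following is a minimal sketch of such a parameterization, assuming a sum-of-hinges form with non-negative hinge offsets (so f(0)=0 holds by construction); the function and parameter names are illustrative and not taken from the paper's code.

```python
import numpy as np

def splash(x, hinges, a_plus, a_minus):
    """Piecewise-linear activation matching the properties in the abstract:
    continuous, grounded (f(0) = 0), with symmetric hinge locations fixed
    at +/-b for each b in `hinges`, and learnable slopes a_plus/a_minus.
    Note: this is an illustrative reconstruction, not the authors' code."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for b, ap, am in zip(hinges, a_plus, a_minus):
        out += ap * np.maximum(0.0, x - b)   # right-side hinge at +b
        out += am * np.maximum(0.0, -x - b)  # mirrored left-side hinge at -b
    return out

# With slopes initialized so that only the first right-hand piece is
# active, this form reduces exactly to ReLU:
hinges = [0.0, 1.0, 2.0]            # fixed, non-negative hinge offsets
a_plus = [1.0, 0.0, 0.0]            # learnable in practice
a_minus = [0.0, 0.0, 0.0]
xs = np.linspace(-3.0, 3.0, 7)
print(splash(xs, hinges, a_plus, a_minus))  # matches np.maximum(0, xs)
```

Because every hinge offset b is non-negative, both `max(0, x - b)` and `max(0, -x - b)` vanish at x = 0, so the grounding constraint f(0)=0 is satisfied for any choice of slopes.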