Johann Radon Institute, Altenberger Straße 69, A-4040 Linz, Austria.
Johann Radon Institute, Altenberger Straße 69, A-4040 Linz, Austria; Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, A-1090 Vienna, Austria; Research Platform Data Science @ Uni Vienna, Währinger Straße 29/S6, A-1090 Vienna, Austria.
Neural Netw. 2022 Nov;155:536-550. doi: 10.1016/j.neunet.2022.09.005. Epub 2022 Sep 14.
In this paper we characterize the set of functions that can be represented by infinite-width neural networks with the RePU activation function max(0,x)^p, when the network coefficients are regularized by an ℓ^q (quasi)norm. Compared to the better-known ReLU activation function (which corresponds to p=1), the RePU activation functions exhibit a greater degree of smoothness, which makes them preferable in several applications. Our main result shows that such representations are possible for a given function if and only if the function is κ-order Lipschitz and its R-norm is finite. This extends earlier work on this topic, which was restricted to the case of the ReLU activation function and coefficient bounds with respect to the ℓ^2 norm. Since ℓ^q regularizations with q<2 are known to promote sparsity, our results also shed light on the ability to obtain sparse neural network representations.
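For orientation, here is a minimal finite-width sketch of the objects involved (the parametrization and symbols below are illustrative assumptions, not quoted from the paper). A width-n RePU network and the ℓ^q penalty on its outer coefficients read, in LaTeX notation,

    % shallow RePU network of width n; p = 1 recovers ReLU
    f(x) = \sum_{i=1}^{n} a_i \, \max\bigl(0, \langle w_i, x \rangle + b_i\bigr)^{p},
    % ell^q (quasi)norm of the outer coefficients, 0 < q < \infty
    \|a\|_{\ell^q} = \Bigl( \sum_{i=1}^{n} |a_i|^{q} \Bigr)^{1/q}.

In the infinite-width setting the sum is replaced by an integral against a signed measure over the inner parameters (w, b), and the R-norm is then, roughly speaking, the minimal regularization cost over all such representations of f.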