Zuowei Shen, Haizhao Yang, Shijun Zhang
Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
Neural Comput. 2021 Mar 26;33(4):1005-1036. doi: 10.1162/neco_a_01364.
A new network with super-approximation power is introduced. This network is built with either the floor function (⌊x⌋) or the ReLU function (max{0, x}) as the activation in each neuron; hence, we call such networks Floor-ReLU networks. For any hyperparameters N ∈ ℕ⁺ and L ∈ ℕ⁺, we show that Floor-ReLU networks with width max{d, 5N + 13} and depth 64dL + 3 can uniformly approximate a Hölder function f on [0,1]^d with an approximation error 3λd^{α/2}N^{−α√L}, where α ∈ (0,1] and λ are the Hölder order and constant, respectively. More generally, for an arbitrary continuous function f on [0,1]^d with modulus of continuity ω_f(·), the constructive approximation rate is ω_f(√d·N^{−√L}) + 2ω_f(√d)·N^{−√L}. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of ω_f(r) as r → 0 is moderate (e.g., ω_f(r) ≲ r^α for Hölder continuous functions), since the major term in our approximation rate is essentially √d times a function of N and L independent of d inside the modulus of continuity.
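To make the stated quantities concrete, below is a minimal Python sketch, not taken from the paper: it transcribes the width/depth and Hölder error-bound expressions from the abstract and evaluates a toy Floor-ReLU forward pass. The function names (floor_relu_size, floor_relu_error_bound, floor_relu_layer, floor_relu_network) are hypothetical, the random weights are placeholders (the paper's proof constructs explicit weights realizing the bound), and the per-neuron floor-or-ReLU choice is one straightforward reading of the definition above.

```python
import math

import numpy as np


def floor_relu_size(d, N, L):
    """Width and depth stated in the abstract: max{d, 5N+13} and 64dL+3."""
    return max(d, 5 * N + 13), 64 * d * L + 3


def floor_relu_error_bound(d, N, L, alpha=1.0, lam=1.0):
    """Hölder error bound 3·λ·d^(α/2)·N^(−α√L) from the abstract."""
    return 3 * lam * d ** (alpha / 2) * N ** (-alpha * math.sqrt(L))


def floor_relu_layer(x, W, b, use_floor):
    """One layer: affine map, then ⌊·⌋ or max{0, ·} per neuron.

    `use_floor` is a boolean mask selecting the floor activation (True)
    or ReLU (False) for each output neuron.
    """
    z = W @ x + b
    return np.where(use_floor, np.floor(z), np.maximum(z, 0.0))


def floor_relu_network(x, params):
    """Evaluate a Floor-ReLU network given a list of (W, b, use_floor)."""
    h = x
    for W, b, mask in params:
        h = floor_relu_layer(h, W, b, mask)
    return h


if __name__ == "__main__":
    print(floor_relu_size(8, 4, 5))         # (33, 2563)
    print(floor_relu_error_bound(8, 4, 5))  # ≈ 0.382 for a 1-Lipschitz f

    # Toy forward pass with random placeholder weights; the constructive
    # weights from the paper's proof are not reproduced here.
    rng = np.random.default_rng(0)
    d, width, depth = 2, 5, 3
    dims = [d] + [width] * (depth - 1) + [1]
    params = [
        (
            rng.normal(size=(dims[i + 1], dims[i])),
            rng.normal(size=dims[i + 1]),
            rng.random(dims[i + 1]) < 0.5,  # floor vs. ReLU per neuron
        )
        for i in range(depth)
    ]
    print(floor_relu_network(rng.random(d), params))
```

For instance, with d = 8, N = 4, L = 5, α = 1, and λ = 1, the stated bound evaluates to about 0.38 for a network of width 33 and depth 2,563; since N^{−α√L} decays exponentially in √L while the depth grows only linearly in L, modest increases in L shrink the error rapidly.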