Leiden University, Mathematical Institute, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands.
Neural Netw. 2019 Feb;110:232-242. doi: 10.1016/j.neunet.2018.11.005. Epub 2018 Dec 4.
Deep neural networks (DNNs) generate much richer function spaces than shallow networks. However, since the function spaces induced by shallow networks have several approximation-theoretic drawbacks of their own, this alone does not necessarily explain the success of deep networks. In this article we take another route and compare the expressive power of DNNs with the ReLU activation function to linear spline methods. We show that MARS (multivariate adaptive regression splines) is improperly learnable by DNNs in the sense that for any function expressible in MARS with M parameters there exists a multilayer neural network with O(M log(M/ε)) parameters that approximates this function up to sup-norm error ε. We show a similar result for expansions with respect to the Faber-Schauder system. Based on this, we derive risk comparison inequalities that bound the statistical risk of fitting a neural network by the statistical risk of spline-based methods. This shows that deep networks perform better than, or only slightly worse than, the considered spline methods. We provide a constructive proof for the function approximations.
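As a minimal illustration of why a result of this kind is plausible (this is not the paper's construction): each MARS hinge basis function max(0, ±(x_j − t)) coincides exactly with a single ReLU unit, so a purely additive MARS fit with M hinge terms is already a one-hidden-layer ReLU network with M hidden units and zero approximation error. The depth and the O(M log(M/ε)) parameter count in the abstract are needed for the interaction terms (products of hinges), which a ReLU network can only approximate. The sketch below, with made-up coefficients and knots, checks the exact additive case numerically.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def mars_additive(x, terms, c0):
    """Additive MARS model: c0 + sum_m coef_m * max(0, sign_m * (x[j_m] - t_m)).
    terms: list of (coef, sign, feature_index, knot)."""
    out = c0
    for coef, sign, j, t in terms:
        out += coef * max(0.0, sign * (x[j] - t))
    return out

def relu_net(x, terms, c0):
    """The same function written as a one-hidden-layer ReLU network
    with one hidden unit per hinge term."""
    W = np.array([[sign if k == j else 0.0 for k in range(len(x))]
                  for _, sign, j, _ in terms])          # hidden-layer weights
    b = np.array([-sign * t for _, sign, _, t in terms])  # hidden-layer biases
    v = np.array([coef for coef, *_ in terms])            # output weights
    return c0 + v @ relu(W @ x + b)

# Hypothetical coefficients and knots, chosen only for the check.
terms = [(1.5, 1.0, 0, 0.3), (-0.7, -1.0, 1, -0.2)]
rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.uniform(-1.0, 1.0, size=2)
    assert abs(mars_additive(x, terms, 0.1) - relu_net(x, terms, 0.1)) < 1e-12
```

Extending this to MARS interaction terms is where the nontrivial part of the result lies: approximating products of hinges with ReLU units is only possible up to an error ε, which is where the log(M/ε) factor enters.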