Jacob A. Zavatone-Veth, Cengiz Pehlevan
Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA.
John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, USA.
Phys Rev E. 2021 Feb;103(2):L020301. doi: 10.1103/PhysRevE.103.L020301.
The expressive power of artificial neural networks crucially depends on the nonlinearity of their activation functions. Though a wide variety of nonlinear activation functions have been proposed for use in artificial neural networks, a detailed understanding of their role in determining the expressive power of a network has not emerged. Here, we study how activation functions affect the storage capacity of treelike two-layer networks. We relate the boundedness or divergence of the capacity in the infinite-width limit to the smoothness of the activation function, elucidating the relationship between previously studied special cases. Our results show that nonlinearity can both increase capacity and decrease the robustness of classification, and provide simple estimates for the capacity of networks with several commonly used activation functions. Furthermore, they generate a hypothesis for the functional benefit of dendritic spikes in branched neurons.
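The storage capacity discussed above is the largest number of random input-label pairs (per input dimension) that a network can memorize. As a hedged illustration of the concept, the sketch below numerically probes the classic single-perceptron case, whose Gardner capacity is α_c = 2: it checks linear separability of random ±1 patterns with random labels via perceptron learning, treating non-convergence within a sweep budget as non-separability (a heuristic, since the perceptron rule is only guaranteed to terminate on separable data). This is not the treelike two-layer model analyzed in the paper, just a minimal baseline for the capacity notion; all function names and parameters here are illustrative.

```python
import numpy as np

def is_separable(X, y, max_sweeps=2000):
    """Heuristic separability test via perceptron learning.

    The perceptron rule converges iff the data are linearly separable;
    we treat failure to converge within max_sweeps as non-separable.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_sweeps):
        updated = False
        for xi, yi in zip(X, y):
            if yi * (xi @ w) <= 0:  # misclassified (or on the boundary)
                w += yi * xi
                updated = True
        if not updated:
            return True  # one clean sweep: all patterns correct
    return False

def frac_separable(alpha, N=25, trials=20, seed=0):
    """Fraction of random instances with P = alpha * N patterns
    in N dimensions that are linearly separable."""
    rng = np.random.default_rng(seed)
    P = int(alpha * N)
    hits = 0
    for _ in range(trials):
        X = rng.choice([-1.0, 1.0], size=(P, N))
        y = rng.choice([-1.0, 1.0], size=P)
        hits += is_separable(X, y)
    return hits / trials

# Well below the Gardner capacity alpha_c = 2, instances are almost
# always separable; well above it, almost never.
low = frac_separable(0.5)
high = frac_separable(4.0)
print(low, high)
```

At moderate N the transition at α_c = 2 is smoothed by finite-size effects, so the probe is sharpest well away from the critical point, as chosen here.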