Department of Mathematics, University of Central Florida, Orlando, FL 32816, USA.
Neural Netw. 2022 Jun;150:259-273. doi: 10.1016/j.neunet.2022.02.016. Epub 2022 Mar 2.
It has been observed that design choices of neural networks are often crucial for their successful optimization. In this article, we therefore discuss the question of whether it is always possible to redesign a neural network so that it trains well with gradient descent. This yields the following universality result: if, for a given network, there is any algorithm that can find good network weights for a classification task, then there exists an extension of this network that reproduces the same forward model by mere gradient descent training. The construction is not intended for practical computations, but it provides some orientation on the possibilities of pre-trained networks in meta-learning and related approaches.
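The paper's actual construction is more involved, but the flavor of the statement can be conveyed by a small toy sketch (this is an illustration of the general idea, not the authors' construction; the names w_star and alpha and the gating scheme are assumptions made here for exposition). Suppose an oracle algorithm has already found good weights w_star for a linear classifier; an extended model with one trainable gate alpha interpolating between a badly initialized branch and the frozen oracle branch is then driven toward the oracle's forward model by plain gradient descent:

    import numpy as np

    # Hypothetical toy, NOT the paper's construction: an oracle algorithm is
    # assumed to have already found good weights w_star for a linear
    # classifier. The "extension" is a single trainable gate alpha that
    # interpolates between a badly initialized branch w and the frozen
    # oracle branch w_star; gradient descent on alpha alone then steers the
    # extended model toward the forward model induced by w_star.

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    w_star = rng.normal(size=5)          # "good" weights from some oracle
    y = (X @ w_star > 0).astype(float)   # labels realizable by w_star

    w = rng.normal(size=5)               # badly initialized branch (kept frozen for brevity)
    alpha = 0.0                          # trainable gate, starts fully closed

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    s, b = X @ w_star, X @ w             # oracle logits and random logits
    lr = 0.1
    for _ in range(500):
        p = sigmoid(alpha * s + (1 - alpha) * b)
        # Binary cross-entropy gradient with respect to the gate alpha only.
        alpha -= lr * np.mean((p - y) * (s - b))

    pred = (alpha * s + (1 - alpha) * b) > 0
    print(f"alpha = {alpha:.2f}, accuracy = {np.mean(pred == y):.2f}")
    # The gate opens (alpha moves well above 0), so the extended model's
    # predictions closely track the classifier induced by w_star.

The real result must of course hold for general networks and cannot presuppose access to w_star during training; the toy only conveys why embedding additional structure into an extended network can let gradient descent succeed where the original parameterization might not.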