Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong.
Corporate Model Risk, Wells Fargo, USA.
Neural Netw. 2021 Jul;139:149-157. doi: 10.1016/j.neunet.2021.02.014. Epub 2021 Feb 27.
Network initialization is the first and a critical step in training neural networks. In this paper, we propose a novel network initialization scheme based on Stein's identity. Viewing multi-layer feedforward sigmoidal neural networks as cascades of multi-index models, we initialize the projection weights to the first hidden layer using eigenvectors of the cross-moment matrix between the input's second-order score function and the response. The input data are then forward propagated to the next layer, and the procedure is repeated until all hidden layers are initialized. Finally, the weights of the output layer are initialized by generalized linear modeling. Extensive numerical results show that the proposed SteinGLM method is much faster and more accurate than other popular methods commonly used for training neural networks.
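The first-layer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the inputs are approximately Gaussian, so that the second-order score function has the closed form S2(x) = Σ⁻¹(x − μ)(x − μ)ᵀΣ⁻¹ − Σ⁻¹, and the function name `stein_first_layer_init`, the ridge term `1e-6`, and the choice to rank eigenvectors by absolute eigenvalue are all assumptions made for illustration.

```python
import numpy as np

def stein_first_layer_init(X, y, n_hidden):
    """Sketch: first-layer directions from eigenvectors of mean(y * S2(x)).

    Assumes Gaussian inputs, so S2(x) = P (x - mu)(x - mu)^T P - P with
    P the precision matrix. X has shape (n, d); y has shape (n,).
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    # Small ridge term (an assumption) keeps the covariance invertible.
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(d)
    prec = np.linalg.inv(cov)
    Z = (X - mu) @ prec                      # rows are P (x_i - mu)
    # Empirical cross-moment matrix between the response and S2(x).
    M = (Z * y[:, None]).T @ Z / n - y.mean() * prec
    M = (M + M.T) / 2                        # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(M)
    # Keep the directions with the largest absolute eigenvalues.
    order = np.argsort(-np.abs(eigvals))[:n_hidden]
    return eigvecs[:, order]                 # shape (d, n_hidden)
```

Under these assumptions the returned columns can serve as initial projection weights for the first hidden layer. Repeating the step on the forward-propagated activations, and fitting the output layer with a generalized linear model, would follow the cascade outlined in the abstract.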