College of Science, China University of Petroleum, Qingdao, 266580, China.
School of Mathematics, Southeast University, Nanjing, 211189, China.
Neural Netw. 2019 Jul;115:50-64. doi: 10.1016/j.neunet.2019.02.011. Epub 2019 Mar 26.
The conjugate gradient method has proven to be an effective strategy for training neural networks owing to its low memory requirements and fast convergence. In this paper, we propose an efficient conjugate gradient method, formulated with the Wirtinger differential operator, for training fully complex-valued network models. Two techniques are adopted to enhance training performance. The first constructs a sufficient descent direction during training by designing a fine-tuned conjugate coefficient. The second determines an optimal learning rate in each iteration, rather than using a fixed constant, by means of a generalized Armijo search. In addition, we rigorously prove weak and strong convergence results: the norm of the gradient of the objective function with respect to the weights approaches zero as the iterations increase, and the weight sequence converges to the optimal point. To verify the effectiveness and rationality of the proposed method, four illustrative simulations are performed on typical regression and classification problems.
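To make the ingredients of the abstract concrete, the following is a minimal Python/NumPy sketch of conjugate-gradient optimization of a real-valued loss over complex weights, using Wirtinger calculus for the gradient and a backtracking Armijo rule for the step size. It is not the paper's algorithm: the toy objective (a complex least-squares problem), the Fletcher-Reeves conjugate coefficient, and the simple backtracking rule are stand-ins for the authors' designed conjugate coefficient and generalized Armijo search.

```python
import numpy as np

# Sketch: conjugate-gradient training of complex parameters w minimizing
# E(w) = ||X w - y||^2. Under Wirtinger calculus, for a real-valued loss
# the descent direction is the negative of the conjugate-Wirtinger
# gradient dE/d(conj(w)); here dE/d(conj(w)) = X^H (X w - y).

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5)) + 1j * rng.standard_normal((20, 5))
y = rng.standard_normal(20) + 1j * rng.standard_normal(20)

def loss(w):
    r = X @ w - y
    return np.real(np.vdot(r, r))            # ||Xw - y||^2, a real scalar

def wirtinger_grad(w):
    # Conjugate-Wirtinger gradient dE/d(conj(w)); -grad is a descent direction.
    return X.conj().T @ (X @ w - y)

def armijo(w, d, g, beta=0.5, sigma=1e-4, eta0=1.0):
    # Backtracking Armijo rule: shrink eta until sufficient decrease holds.
    # Directional derivative of E along d is 2 Re(g^H d) under Wirtinger calculus.
    eta, f0 = eta0, loss(w)
    slope = 2.0 * np.real(np.vdot(g, d))
    while loss(w + eta * d) > f0 + sigma * eta * slope:
        eta *= beta
    return eta

w = np.zeros(5, dtype=complex)
g = wirtinger_grad(w)
d = -g                                        # initial direction: steepest descent
for k in range(50):
    eta = armijo(w, d, g)                     # per-iteration optimal-style step size
    w = w + eta * d
    g_new = wirtinger_grad(w)
    # Fletcher-Reeves coefficient, a common choice; the paper instead designs
    # its own coefficient so that the direction is always a sufficient descent one.
    beta_fr = np.real(np.vdot(g_new, g_new)) / max(np.real(np.vdot(g, g)), 1e-12)
    d = -g_new + beta_fr * d
    if np.real(np.vdot(g_new, d)) >= 0:       # safeguard: restart if not descent
        d = -g_new
    g = g_new

print("final loss:", loss(w))
```

The printed loss decreases toward the least-squares optimum; in a full complex-valued network, `wirtinger_grad` would be replaced by backpropagated Wirtinger derivatives of the training objective.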