Magoulas G D, Vrahatis M N, Androulakis G S
Department of Informatics, University of Athens, GR-157.71, Athens, Greece.
Neural Comput. 1999 Oct 1;11(7):1769-96. doi: 10.1162/089976699300016223.
This article focuses on gradient-based backpropagation algorithms that use either a common adaptive learning rate for all weights or an individual adaptive learning rate for each weight and apply the Goldstein/Armijo line search. The learning-rate adaptation is based on descent techniques and estimates of the local Lipschitz constant that are obtained without additional error function and gradient evaluations. The proposed algorithms improve the backpropagation training in terms of both convergence rate and convergence characteristics, such as stable learning and robustness to oscillations. Simulations are conducted to compare and evaluate the convergence behavior of these gradient-based training algorithms with several popular training methods.
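The common-learning-rate variant described above can be illustrated with a minimal sketch. The abstract does not give the update formulas, so everything here is an assumption made for illustration: the function name `lipschitz_armijo_descent`, the estimate Λ = ‖g_k − g_{k−1}‖ / ‖w_k − w_{k−1}‖ built from already-computed gradients, the step proposal η = 1/(2Λ), the Armijo constant `sigma`, and the halving rule used when the sufficient-decrease test fails. A toy quadratic stands in for a network's error function.

```python
import math


def lipschitz_armijo_descent(f, grad, w0, eta0=0.1, sigma=1e-4,
                             tol=1e-8, max_iter=500):
    """Gradient descent with one shared adaptive learning rate (illustrative).

    The step size is proposed from a local Lipschitz estimate that reuses
    gradients already computed, then checked against the Armijo
    sufficient-decrease condition and halved while the check fails.
    """
    w = list(w0)
    fw, g = f(w), grad(w)
    eta = eta0
    for _iteration in range(max_iter):
        gnorm2 = sum(gi * gi for gi in g)
        if math.sqrt(gnorm2) < tol:
            break
        # Armijo test: f(w - eta*g) - f(w) <= -sigma * eta * ||g||^2
        for _halving in range(60):
            w_new = [wi - eta * gi for wi, gi in zip(w, g)]
            f_new = f(w_new)
            if f_new - fw <= -sigma * eta * gnorm2:
                break
            eta *= 0.5
        else:
            break  # no acceptable step found; give up
        g_new = grad(w_new)
        # Local Lipschitz estimate from two successive iterates (assumed form).
        dw = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_new, w)))
        dg = math.sqrt(sum((a - b) ** 2 for a, b in zip(g_new, g)))
        if dg > 0.0:
            eta = 0.5 * dw / dg  # assumed proposal: eta = 1 / (2 * Lambda)
        w, fw, g = w_new, f_new, g_new
    return w, fw


# Toy error function (not from the paper): an ill-conditioned quadratic.
def error(w):
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def error_grad(w):
    return [w[0], 10.0 * w[1]]


w_final, f_final = lipschitz_armijo_descent(error, error_grad, [3.0, -2.0])
```

In this sketch the Lipschitz estimate costs no extra error or gradient evaluations, since it reuses the gradient needed for the next step anyway. Only the Armijo check evaluates the error function at the trial point. A per-weight variant would keep one such estimate, and one learning rate, for each coordinate.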