Saito K, Nakano R
NTT Communication Science Laboratories, Kyoto, Japan.
Neural Comput. 1997 Jan 1;9(1):123-41. doi: 10.1162/neco.1997.9.1.123.
Second-order learning algorithms based on quasi-Newton methods have two problems. First, standard quasi-Newton methods are impractical for large-scale problems because they require N^2 storage space to maintain an approximation to an inverse Hessian matrix (N is the number of weights). Second, a line search to calculate a reasonably accurate step length is indispensable for these algorithms, and it must be both efficient and reasonably accurate to give desirable performance. To overcome these problems, we propose a new second-order learning algorithm. The descent direction is calculated on the basis of a partial Broyden-Fletcher-Goldfarb-Shanno (BFGS) update with 2Ns memory space (s << N), and a reasonably accurate step length is efficiently calculated as the minimal point of a second-order approximation to the objective function with respect to the step length. Our experiments, which use a parity problem and a speech synthesis problem, show that the proposed algorithm outperforms major learning algorithms. Moreover, it turns out that an efficient and accurate step-length calculation plays an important role in the convergence of quasi-Newton algorithms, and that a partial BFGS update greatly saves storage space without losing convergence performance.
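To make the two ingredients concrete, the following is a minimal NumPy sketch of what the abstract describes: a partial (limited-memory) BFGS direction built from only the s most recent update pairs, and a step length taken as the minimizer of a second-order approximation of the objective along the search direction. The function names, the finite-difference curvature estimate, the gamma scaling of the initial matrix, the fallback step of 1.0, and the toy quadratic driver are all illustrative assumptions, not the paper's formulation (the paper evaluates the second-order term directly for three-layer networks).

```python
import numpy as np

def lbfgs_direction(g, s_hist, y_hist):
    """Search direction from the s most recent (s_k, y_k) pairs via the
    two-loop recursion: a partial BFGS update needing only 2Ns floats
    instead of the N^2 of a full inverse-Hessian approximation."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / y.dot(s)          # assumes the curvature condition y.s > 0
        a = rho * s.dot(q)
        alphas.append(a)
        q -= a * y
    if s_hist:                        # initial matrix H0 = gamma * I
        s, y = s_hist[-1], y_hist[-1]
        q *= s.dot(y) / y.dot(y)
    for (s, y), a in zip(zip(s_hist, y_hist), reversed(alphas)):
        b = (1.0 / y.dot(s)) * y.dot(q)
        q += (a - b) * s
    return -q

def quadratic_step_length(f_grad, w, d, g, eps=1e-4):
    """Step length as the minimal point of the second-order model
    f(w + t d) ~= f(w) + t g.d + 0.5 t^2 d'Hd.  Here d'Hd is estimated
    by a finite difference of gradients along d (an assumption; the
    paper computes this term exactly for three-layer networks)."""
    dHd = (f_grad(w + eps * d) - g).dot(d) / eps
    return 1.0 if dHd <= 0.0 else -g.dot(d) / dHd

# Illustrative driver on a toy quadratic objective (not from the paper):
A = np.diag([1.0, 10.0, 100.0])
f_grad = lambda w: A @ w                  # gradient of 0.5 * w'Aw
w, s_max = np.array([1.0, 1.0, 1.0]), 2   # s_max plays the role of s << N
s_hist, y_hist = [], []
g = f_grad(w)
for _ in range(20):
    if np.linalg.norm(g) < 1e-8:
        break
    d = lbfgs_direction(g, s_hist, y_hist)
    t = quadratic_step_length(f_grad, w, d, g)
    w_new = w + t * d
    g_new = f_grad(w_new)
    s_hist.append(w_new - w); y_hist.append(g_new - g)
    if len(s_hist) > s_max:               # keep only the s most recent pairs
        s_hist.pop(0); y_hist.pop(0)
    w, g = w_new, g_new
print("final ||grad|| =", np.linalg.norm(g))
```

Keeping only s_max pairs of N-vectors is what bounds the storage at 2Ns floats, and the closed-form step -g.d / d'Hd replaces the iterative function evaluations of a conventional line search.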