Saito K, Nakano R
NTT Communication Science Laboratories, Kyoto, Japan.
Neural Comput. 1997 Jan 1;9(1):123-41. doi: 10.1162/neco.1997.9.1.123.
Second-order learning algorithms based on quasi-Newton methods have two problems. First, standard quasi-Newton methods are impractical for large-scale problems because they require N^2 storage space to maintain an approximation to an inverse Hessian matrix (N is the number of weights). Second, a line search to calculate a reasonably accurate step length is indispensable for these algorithms, and it must be both efficient and reasonably accurate to give desirable performance. To overcome these problems, we propose a new second-order learning algorithm. The descent direction is calculated on the basis of a partial Broyden-Fletcher-Goldfarb-Shanno (BFGS) update with 2Ns memory space (s << N), and a reasonably accurate step length is efficiently calculated as the minimal point of a second-order approximation to the objective function with respect to the step length. Our experiments, which use a parity problem and a speech synthesis problem, show that the proposed algorithm outperforms major learning algorithms. Moreover, it turns out that an efficient and accurate step-length calculation plays an important role in the convergence of quasi-Newton algorithms, and that a partial BFGS update greatly saves storage space without losing convergence performance.
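To make the two ingredients concrete, the following is a minimal NumPy sketch of what the abstract describes: a partial (limited-memory) BFGS direction built from only the s most recent update pairs, and a step length taken as the minimizer of a second-order approximation of the objective along the search direction. The function names, the finite-difference curvature estimate, the gamma scaling of the initial matrix, the fallback step of 1.0, and the toy quadratic driver are all illustrative assumptions, not the paper's formulation (the paper evaluates the second-order term directly for three-layer networks).

```python
import numpy as np

def lbfgs_direction(g, s_hist, y_hist):
    """Search direction from the s most recent (s_k, y_k) pairs via the
    two-loop recursion: a partial BFGS update needing only 2Ns floats
    instead of the N^2 of a full inverse-Hessian approximation."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / y.dot(s)          # assumes the curvature condition y.s > 0
        a = rho * s.dot(q)
        alphas.append(a)
        q -= a * y
    if s_hist:                        # initial matrix H0 = gamma * I
        s, y = s_hist[-1], y_hist[-1]
        q *= s.dot(y) / y.dot(y)
    for (s, y), a in zip(zip(s_hist, y_hist), reversed(alphas)):
        b = (1.0 / y.dot(s)) * y.dot(q)
        q += (a - b) * s
    return -q

def quadratic_step_length(f_grad, w, d, g, eps=1e-4):
    """Step length as the minimal point of the second-order model
    f(w + t d) ~= f(w) + t g.d + 0.5 t^2 d'Hd.  Here d'Hd is estimated
    by a finite difference of gradients along d (an assumption; the
    paper computes this term exactly for three-layer networks)."""
    dHd = (f_grad(w + eps * d) - g).dot(d) / eps
    return 1.0 if dHd <= 0.0 else -g.dot(d) / dHd

# Illustrative driver on a toy quadratic objective (not from the paper):
A = np.diag([1.0, 10.0, 100.0])
f_grad = lambda w: A @ w                  # gradient of 0.5 * w'Aw
w, s_max = np.array([1.0, 1.0, 1.0]), 2   # s_max plays the role of s << N
s_hist, y_hist = [], []
g = f_grad(w)
for _ in range(20):
    if np.linalg.norm(g) < 1e-8:
        break
    d = lbfgs_direction(g, s_hist, y_hist)
    t = quadratic_step_length(f_grad, w, d, g)
    w_new = w + t * d
    g_new = f_grad(w_new)
    s_hist.append(w_new - w); y_hist.append(g_new - g)
    if len(s_hist) > s_max:               # keep only the s most recent pairs
        s_hist.pop(0); y_hist.pop(0)
    w, g = w_new, g_new
print("final ||grad|| =", np.linalg.norm(g))
```

Keeping only s_max pairs of N-vectors is what bounds the storage at 2Ns floats, and the closed-form step -g.d / d'Hd replaces the iterative function evaluations of a conventional line search.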