Faster Stochastic Quasi-Newton Methods.

Authors

Zhang Qingsong, Huang Feihu, Deng Cheng, Huang Heng

Publication

IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4388-4397. doi: 10.1109/TNNLS.2021.3056947. Epub 2022 Aug 31.

DOI: 10.1109/TNNLS.2021.3056947
PMID: 33667166
Abstract

Stochastic optimization methods have become a popular class of optimization tools in machine learning. In particular, stochastic gradient descent (SGD) has been widely used for machine learning problems, such as training neural networks, owing to its low per-iteration computational complexity. Newton and quasi-Newton (QN) methods, which leverage second-order information, can reach better solutions than first-order methods. Stochastic QN (SQN) methods have therefore been developed to obtain better solutions more efficiently than stochastic first-order methods by exploiting approximate second-order information. However, existing SQN methods still do not reach the best known stochastic first-order oracle (SFO) complexity. To fill this gap, we propose a novel faster stochastic QN method (SpiderSQN) based on the variance-reduction technique of SPIDER. We prove that SpiderSQN reaches the best known SFO complexity of O(n + √n ϵ⁻²) in the finite-sum setting for finding an ϵ-first-order stationary point. To further improve its practical performance, we incorporate SpiderSQN with different momentum schemes. Moreover, the proposed algorithms are generalized to the online setting, where a corresponding SFO complexity of O(ϵ⁻³) is established, which also matches the best existing result. Extensive experiments on benchmark data sets demonstrate that our new algorithms outperform state-of-the-art approaches for nonconvex optimization.
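To make the abstract's construction concrete, here is a minimal Python sketch of the core idea: drive a quasi-Newton step with SPIDER's path-integrated gradient-difference estimator instead of a raw stochastic gradient. Everything here is an illustrative assumption based only on the abstract — the name spider_sqn, the component-gradient oracle grad(x, idx), the BFGS-style inverse-Hessian update, and all step-size and batch parameters — not the paper's actual algorithm, which also incorporates momentum schemes.

```python
# Minimal sketch: SPIDER variance-reduced gradient estimator driving a
# quasi-Newton step. Illustrative only; not the paper's algorithm.
import numpy as np

def spider_sqn(grad, x0, n, eta=0.01, q=32, batch=64, iters=200, seed=0):
    """grad(x, idx) -> mean gradient of the component functions in idx at x.

    n is the number of components in the finite sum f = (1/n) * sum_i f_i.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    H = np.eye(d)                         # inverse-Hessian approximation
    v = grad(x, np.arange(n))             # anchor: full gradient
    for t in range(1, iters + 1):
        x_new = x - eta * H @ v           # quasi-Newton step on the estimator
        idx = rng.choice(n, size=min(batch, n), replace=False)
        # SPIDER recursion: correct the running estimate with a minibatch
        # gradient difference evaluated at consecutive iterates.
        v_new = grad(x_new, idx) - grad(x, idx) + v
        # BFGS-style update of H from iterate/estimator displacements
        # (one standard choice; skipped when curvature is unusable).
        s, y = x_new - x, v_new - v
        sy = s @ y
        if sy > 1e-10:
            rho = 1.0 / sy
            I = np.eye(d)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, v = x_new, v_new
        if t % q == 0:                    # periodic full-gradient refresh
            v = grad(x, np.arange(n))
    return x

# Illustrative usage: finite-sum least squares, f_i(x) = 0.5 * (a_i @ x - b_i)**2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(500, 10)), rng.normal(size=500)
grad = lambda x, idx: A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)
x_star = spider_sqn(grad, np.zeros(10), n=500)
```

The periodic full-gradient refresh every q iterations is what keeps the estimator's variance under control; the momentum variants mentioned in the abstract would change only how the step x_new is formed.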

Similar Articles

1. Faster Stochastic Quasi-Newton Methods.
   IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4388-4397. doi: 10.1109/TNNLS.2021.3056947. Epub 2022 Aug 31.
2. Asynchronous Parallel Stochastic Quasi-Newton Methods.
   Parallel Comput. 2021 Apr;101. doi: 10.1016/j.parco.2020.102721. Epub 2020 Nov 4.
3. Painless Stochastic Conjugate Gradient for Large-Scale Machine Learning.
   IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):14645-14658. doi: 10.1109/TNNLS.2023.3280826. Epub 2024 Oct 7.
4. Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity.
   IEEE Trans Pattern Anal Mach Intell. 2024 Aug 14;PP. doi: 10.1109/TPAMI.2023.3347082.
5. Faster First-Order Methods for Stochastic Non-Convex Optimization on Riemannian Manifolds.
   IEEE Trans Pattern Anal Mach Intell. 2021 Feb;43(2):459-472. doi: 10.1109/TPAMI.2019.2933841. Epub 2021 Jan 8.
6. AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization.
   Comput Intell Neurosci. 2021 Nov 10;2021:5790608. doi: 10.1155/2021/5790608. eCollection 2021.
7. Gradient Descent Ascent for Minimax Problems on Riemannian Manifolds.
   IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):8466-8476. doi: 10.1109/TPAMI.2023.3234160. Epub 2023 Jun 5.
8. Preconditioned Stochastic Gradient Descent.
   IEEE Trans Neural Netw Learn Syst. 2018 May;29(5):1454-1466. doi: 10.1109/TNNLS.2017.2672978. Epub 2017 Mar 9.
9. Riemannian gradient methods for stochastic composition problems.
   Neural Netw. 2022 Sep;153:224-234. doi: 10.1016/j.neunet.2022.06.004. Epub 2022 Jun 11.
10. Parallel nonlinear optimization techniques for training neural networks.
   IEEE Trans Neural Netw. 2003;14(6):1460-8. doi: 10.1109/TNN.2003.820670.