

Momentum Acceleration in the Individual Convergence of Nonsmooth Convex Optimization With Constraints.

Author Information

Tao Wei, Wu Gao-Wei, Tao Qing

Publication Information

IEEE Trans Neural Netw Learn Syst. 2022 Mar;33(3):1107-1118. doi: 10.1109/TNNLS.2020.3040325. Epub 2022 Feb 28.

DOI: 10.1109/TNNLS.2020.3040325
PMID: 33290233
Abstract

The momentum technique has recently emerged as an effective strategy for accelerating the convergence of gradient descent (GD) methods, and it exhibits improved performance in deep learning as well as regularized learning. Typical momentum methods include Nesterov's accelerated gradient (NAG) and the heavy-ball (HB) method. However, almost all acceleration analyses so far are limited to NAG, and only a few investigations of the acceleration of HB have been reported. In this article, we address the convergence of the last iterate of HB in nonsmooth optimization with constraints, which we name individual convergence. This question is significant in machine learning, where constraints must be imposed on the learning structure and an individual output is needed to effectively guarantee this structure while keeping an optimal rate of convergence. Specifically, we prove that HB achieves an individual convergence rate of O(1/√t), where t is the number of iterations. This indicates that both momentum methods can accelerate the individual convergence of basic GD to the optimal rate. Even for the convergence of averaged iterates, our result avoids the disadvantages of previous work, which restricted the optimization problem to be unconstrained and required the number of iterations to be fixed in advance. The convergence analysis presented in this article provides a clear understanding of how HB momentum accelerates individual convergence and reveals further insights into the similarities and differences between obtaining averaged and individual convergence rates. The derived optimal individual convergence is extended to regularized and stochastic settings, in which an individual solution can be produced by a projection-based operation. In contrast to the averaged output, the sparsity can be reduced remarkably without sacrificing the theoretically optimal rates. Several experiments on real data demonstrate the performance of the HB momentum strategy.
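To make the projection-based HB iteration described above concrete, here is a minimal runnable sketch in Python. The hinge-type loss, the L2-ball constraint set, the step size eta_t = 1/√t, and the momentum schedule beta_t = t/(t+2) are illustrative assumptions, not the paper's exact construction; what the sketch does share with the paper is that every iterate is kept feasible by a Euclidean projection and that the last iterate (the individual output) is returned rather than an average.

```python
import numpy as np

def project_l2_ball(w, radius=1.0):
    """Euclidean projection onto the constraint set {w : ||w||_2 <= radius}."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def hinge_subgradient(w, A, b):
    """A subgradient of the nonsmooth loss f(w) = mean_i max(0, 1 - b_i <a_i, w>)."""
    active = (1.0 - b * (A @ w)) > 0          # examples with positive hinge loss
    return -(A.T @ (active * b)) / len(b)

def projected_heavy_ball(A, b, T=1000, radius=1.0):
    """Projected HB iteration:
        w_{t+1} = P_C( w_t + beta_t * (w_t - w_{t-1}) - eta_t * g_t ),
    returning the last iterate (the individual output)."""
    w_prev = np.zeros(A.shape[1])
    w = np.zeros(A.shape[1])
    for t in range(1, T + 1):
        eta = 1.0 / np.sqrt(t)    # diminishing step size (assumed schedule)
        beta = t / (t + 2.0)      # time-varying momentum (assumed schedule)
        g = hinge_subgradient(w, A, b)
        w_next = project_l2_ball(w + beta * (w - w_prev) - eta * g, radius)
        w_prev, w = w, w_next
    return w
```

A quick smoke test on synthetic data (assumed setup):

```python
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
b = np.sign(A @ rng.normal(size=10))
w_last = projected_heavy_ball(A, b, T=2000)
```

Because the projection is applied at every step, the final iterate satisfies the constraint exactly; this is the practical point of individual (last-iterate) convergence, since averaging the iterates can destroy structure such as sparsity even though the average of feasible points remains feasible for a convex set.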


Similar Articles

1. Momentum Acceleration in the Individual Convergence of Nonsmooth Convex Optimization With Constraints.
   IEEE Trans Neural Netw Learn Syst. 2022 Mar;33(3):1107-1118. doi: 10.1109/TNNLS.2020.3040325. Epub 2022 Feb 28.
2. The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization.
   IEEE Trans Neural Netw Learn Syst. 2020 Jul;31(7):2557-2568. doi: 10.1109/TNNLS.2019.2933452. Epub 2019 Sep 2.
3. Primal Averaging: A New Gradient Evaluation Step to Attain the Optimal Individual Convergence.
   IEEE Trans Cybern. 2020 Feb;50(2):835-845. doi: 10.1109/TCYB.2018.2874332. Epub 2018 Oct 19.
4. Stochastic momentum methods for non-convex learning without bounded assumptions.
   Neural Netw. 2023 Aug;165:830-845. doi: 10.1016/j.neunet.2023.06.021. Epub 2023 Jun 23.
5. Adaptive Restart of the Optimized Gradient Method for Convex Optimization.
   J Optim Theory Appl. 2018 Jul;178(1):240-263. doi: 10.1007/s10957-018-1287-4. Epub 2018 May 7.
6. A Unified Analysis of AdaGrad With Weighted Aggregation and Momentum Acceleration.
   IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):14482-14490. doi: 10.1109/TNNLS.2023.3279381. Epub 2024 Oct 7.
7. Convergence analysis of AdaBound with relaxed bound functions for non-convex optimization.
   Neural Netw. 2022 Jan;145:300-307. doi: 10.1016/j.neunet.2021.10.026. Epub 2021 Nov 8.
8. Convergence Analysis of Distributed Gradient Descent Algorithms With One and Two Momentum Terms.
   IEEE Trans Cybern. 2024 Mar;54(3):1511-1522. doi: 10.1109/TCYB.2022.3218663. Epub 2024 Feb 9.
9. Distributed Stochastic Proximal Algorithm With Random Reshuffling for Nonsmooth Finite-Sum Optimization.
   IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4082-4096. doi: 10.1109/TNNLS.2022.3201711. Epub 2024 Feb 29.
10. NALA: a Nesterov accelerated look-ahead optimizer for deep learning.
    PeerJ Comput Sci. 2024 Jul 3;10:e2167. doi: 10.7717/peerj-cs.2167. eCollection 2024.

Cited By

1. QSAR modeling without descriptors using graph convolutional neural networks: the case of mutagenicity prediction.
   Mol Divers. 2021 Aug;25(3):1283-1299. doi: 10.1007/s11030-021-10250-2. Epub 2021 Jun 19.