Yang Zhuang
IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):14645-14658. doi: 10.1109/TNNLS.2023.3280826. Epub 2024 Oct 7.
Conjugate gradient (CG), an effective technique for speeding up gradient descent algorithms, has shown great potential and has been widely used for large-scale machine-learning problems. However, CG and its variants were not devised for the stochastic setting, which makes them extremely unstable and can even lead to divergence when noisy gradients are used. This article develops a novel class of stable stochastic CG (SCG) algorithms with a faster convergence rate via a variance-reduction technique and an adaptive step-size rule in the mini-batch setting. In particular, instead of the line search used in CG-type approaches, which is time-consuming and can even fail for SCG, this article uses the random stabilized Barzilai-Borwein (RSBB) method to obtain the step size online. We rigorously analyze the convergence properties of the proposed algorithms and show that they attain a linear convergence rate in both the strongly convex and nonconvex settings. We also show that the total complexity of the proposed algorithms matches that of modern stochastic optimization algorithms under different cases. Extensive numerical experiments on machine-learning problems demonstrate that the proposed algorithms outperform state-of-the-art stochastic optimization algorithms.