Liu Liu, Liu Xuanqing, Hsieh Cho-Jui, Tao Dacheng
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1651-1663. doi: 10.1109/TNNLS.2023.3326177. Epub 2025 Jan 7.
Trust region (TR) and adaptive regularization using cubics (ARC) have appealing theoretical properties for nonconvex optimization: at each iterate they compute the function value, gradient, and Hessian matrix to obtain the next search direction and the adjusted parameters. Although stochastic approximation greatly reduces the computational cost, theoretically guaranteeing the convergence rate is challenging. In this article, we explore a family of stochastic TR (STR) and stochastic ARC (SARC) methods that simultaneously allow inexact computation of the Hessian matrix, the gradient, and the function value. Our algorithms require much less propagation overhead per iteration than TR and ARC. We prove that the iteration complexity to achieve ε-approximate second-order optimality is of the same order as that of the exact counterparts established in previous studies. In addition, the mild conditions on inexactness can be met by leveraging a random sampling technique in the finite-sum minimization problem. Numerical experiments on a nonconvex problem support these findings and demonstrate that, with the same or a similar number of iterations, our algorithms require less computational overhead per iteration than current second-order methods.
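As a rough illustration of the subsampling idea in the finite-sum setting, the following is a minimal sketch of an inexact trust-region step, not the authors' exact STR/SARC algorithm: the toy quadratic-sum objective, the minibatch sizes, and the radius-update constants are all illustrative assumptions.

```python
import numpy as np

# Sketch of a subsampled trust-region (STR) step for a finite-sum problem
# f(x) = (1/n) * sum_i f_i(x). The gradient and Hessian are estimated on
# independent random minibatches; all constants here are illustrative.

rng = np.random.default_rng(0)

# Toy nonconvex finite-sum problem: f_i(x) = 0.5 x^T A_i x + b_i^T x
n, d = 200, 5
A = rng.standard_normal((n, d, d))
A = 0.5 * (A + A.transpose(0, 2, 1))  # symmetrize each component
b = rng.standard_normal((n, d))

def sub_grad(x, idx):
    """Subsampled gradient over the minibatch idx."""
    return np.mean(A[idx] @ x + b[idx], axis=0)

def sub_hess(idx):
    """Subsampled Hessian over the minibatch idx."""
    return np.mean(A[idx], axis=0)

def f_full(x):
    """Exact objective (used here only for the acceptance test)."""
    return np.mean(0.5 * np.einsum('i,nij,j->n', x, A, x) + b @ x)

def tr_subproblem(g, H, radius):
    """Solve min_s g^T s + 0.5 s^T H s  s.t. ||s|| <= radius by bisection
    on the Lagrange multiplier (fine for small d; a sketch, not optimized)."""
    lam_lo = max(0.0, -np.linalg.eigvalsh(H)[0]) + 1e-12
    s = np.linalg.solve(H + lam_lo * np.eye(d), -g)
    if np.linalg.norm(s) <= radius:
        return s  # interior (Newton-like) solution
    lam_hi = lam_lo + 1.0
    while np.linalg.norm(np.linalg.solve(H + lam_hi * np.eye(d), -g)) > radius:
        lam_hi *= 2.0
    for _ in range(50):  # bisect until the step lies on the boundary
        lam = 0.5 * (lam_lo + lam_hi)
        s = np.linalg.solve(H + lam * np.eye(d), -g)
        if np.linalg.norm(s) > radius:
            lam_lo = lam
        else:
            lam_hi = lam
    return s

x, radius = rng.standard_normal(d), 1.0
for it in range(30):
    idx_g = rng.choice(n, size=64, replace=False)  # gradient minibatch
    idx_h = rng.choice(n, size=32, replace=False)  # smaller Hessian minibatch
    g, H = sub_grad(x, idx_g), sub_hess(idx_h)
    s = tr_subproblem(g, H, radius)
    pred = -(g @ s + 0.5 * s @ H @ s)       # model-predicted decrease
    actual = f_full(x) - f_full(x + s)      # could itself be subsampled
    rho = actual / max(pred, 1e-12)
    if rho > 0.1:                           # accept the step, grow the radius
        x, radius = x + s, min(2.0 * radius, 10.0)
    else:                                   # reject the step, shrink the radius
        radius *= 0.5
print("final f(x) =", f_full(x))
```

Here the acceptance ratio rho compares the actual decrease against the subsampled model's prediction; in the inexact setting of the paper, the function values in that test may also be approximated, subject to the stated inexactness conditions.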