Tuo Zhao, Mo Yu, Yiming Wang, Raman Arora, Han Liu
Johns Hopkins University; Princeton University.
Harbin Institute of Technology.
Adv Neural Inf Process Syst. 2014 Dec;27:5614.
We consider regularized empirical risk minimization problems. In particular, we minimize the sum of a smooth empirical risk function and a nonsmooth regularization function. When the regularization function is block separable, we can solve the minimization problems in a randomized block coordinate descent (RBCD) manner. Existing RBCD methods usually decrease the objective value by exploiting the partial gradient of a randomly selected block of coordinates in each iteration. Thus they need all data to be accessible so that the partial gradient over the selected block can be computed exactly. However, such a "batch" setting may be computationally expensive in practice. In this paper, we propose a mini-batch randomized block coordinate descent (MRBCD) method, which estimates the partial gradient of the selected block based on a mini-batch of randomly sampled data in each iteration. We further accelerate the MRBCD method by exploiting a semi-stochastic optimization scheme, which effectively reduces the variance of the partial gradient estimators. Theoretically, we show that for strongly convex functions, the MRBCD method attains lower overall iteration complexity than existing RBCD methods. As an application, we further tailor the MRBCD method to regularized sparse learning problems. Our numerical experiments show that the MRBCD method naturally exploits the sparsity structure and achieves better computational performance than existing methods.
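To make the algorithmic idea concrete, the following is a minimal sketch, not the authors' implementation, of mini-batch randomized block coordinate descent with SVRG-style (semi-stochastic) variance reduction, applied to an assumed instance of the problem class: l1-regularized least squares, where the l1 penalty is block separable and the proximal step is soft-thresholding. All function names, block/batch sizes, and step sizes below are illustrative choices, not values from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (block-separable l1 regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def mrbcd_sketch(X, y, lam, step, n_blocks=10, batch_size=10,
                 n_epochs=20, inner_iters=None, seed=0):
    """Hypothetical sketch of mini-batch randomized block coordinate descent
    with a semi-stochastic (SVRG-style) variance-reduced partial gradient,
    for min_w (1/2n) ||Xw - y||^2 + lam * ||w||_1."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    blocks = np.array_split(np.arange(d), n_blocks)
    if inner_iters is None:
        inner_iters = n // batch_size

    for _ in range(n_epochs):
        # Outer stage: snapshot the iterate and compute its full gradient
        # (this is the "semi-stochastic" component that reduces variance).
        w_snap = w.copy()
        full_grad = X.T @ (X @ w_snap - y) / n

        for _ in range(inner_iters):
            # Randomly select a block of coordinates and a mini-batch of samples.
            j = rng.integers(n_blocks)
            idx = blocks[j]
            batch = rng.choice(n, size=batch_size, replace=False)
            Xb, yb = X[batch], y[batch]

            # Variance-reduced estimate of the partial gradient on block j:
            # mini-batch partial gradient at w, minus the same quantity at the
            # snapshot, plus the snapshot's exact partial gradient.
            g_w = Xb[:, idx].T @ (Xb @ w - yb) / batch_size
            g_snap = Xb[:, idx].T @ (Xb @ w_snap - yb) / batch_size
            g = g_w - g_snap + full_grad[idx]

            # Proximal gradient step on the selected block only.
            w[idx] = soft_threshold(w[idx] - step * g, step * lam)

    return w

if __name__ == "__main__":
    # Toy sparse regression example (synthetic data, illustrative only).
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 100))
    w_true = np.zeros(100)
    w_true[:5] = 1.0
    y = X @ w_true + 0.01 * rng.standard_normal(500)
    w_hat = mrbcd_sketch(X, y, lam=0.05, step=0.05)
    print("recovered support:", np.flatnonzero(np.abs(w_hat) > 1e-3))
```

Because only the selected block is updated and the l1 proximal map is applied blockwise, the iterates stay sparse between full-gradient snapshots, which is the sparsity structure the abstract refers to.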