

A multivariate adaptive gradient algorithm with reduced tuning efforts.

Affiliations

School of Electrical Engineering and Computer Science, The Pennsylvania State University, State College, PA 16802, USA.

Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.

Publication Information

Neural Netw. 2022 Aug;152:499-509. doi: 10.1016/j.neunet.2022.05.016. Epub 2022 May 21.

DOI: 10.1016/j.neunet.2022.05.016
PMID: 35640371
Abstract

Large neural networks usually perform well on machine learning tasks. However, models that achieve state-of-the-art performance involve an arbitrarily large number of parameters, and their training is therefore very expensive. It is thus desirable to implement methods with small per-iteration costs, fast convergence rates, and reduced tuning. This paper proposes a multivariate adaptive gradient descent method that meets the above attributes. The proposed method updates every element of the model parameters separately, in a computationally efficient manner, using an adaptive vector-form learning rate, resulting in low per-iteration cost. The adaptive learning rate computes the absolute difference of the current and previous model parameters over the difference in subgradients of the current and previous state estimates. In the deterministic setting, we show that the cost function value converges at a linear rate for smooth and strongly convex cost functions. In both the deterministic and stochastic settings, we show that the gradient converges in expectation at the order of O(1/k) for a non-convex cost function with Lipschitz continuous gradient. In addition, we show that after T iterates, the cost function of the last iterate scales as O(log(T)/T) for non-smooth strongly convex cost functions. The effectiveness of the proposed method is validated on convex functions, a smooth non-convex function, a non-smooth convex function, and four image classification data sets, while showing that its execution requires hardly any tuning, unlike existing popular optimizers that entail relatively large tuning efforts. Our empirical results show that the proposed algorithm provides the best overall performance when compared to tuned state-of-the-art optimizers.

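To make the update rule concrete, below is a minimal sketch of the element-wise step the abstract describes: each coordinate's learning rate is the absolute parameter change divided by the absolute (sub)gradient change from the previous iterate. The function name, the bootstrap step with a fixed initial rate, and the eps guard against zero denominators are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def multivariate_adaptive_gd(grad_fn, x0, lr0=1e-3, eps=1e-12, iters=100):
    """Sketch of an element-wise adaptive gradient descent step.

    Per coordinate i: lr[i] = |x_k[i] - x_{k-1}[i]| / |g_k[i] - g_{k-1}[i]|,
    matching the vector-form learning rate described in the abstract.
    """
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad_fn(x_prev)
    x = x_prev - lr0 * g_prev  # bootstrap: first step uses a fixed rate
    for _ in range(iters):
        g = grad_fn(x)
        # Vector-form learning rate; eps (an assumption) avoids division by zero.
        lr = np.abs(x - x_prev) / (np.abs(g - g_prev) + eps)
        x_prev, g_prev = x, g
        x = x - lr * g
    return x

# Toy usage on a smooth, strongly convex quadratic f(x) = 0.5 * x' D x,
# whose gradient is D * x; the abstract predicts linear convergence here.
D = np.array([1.0, 10.0, 100.0])
x_final = multivariate_adaptive_gd(lambda x: D * x, np.ones(3), iters=50)
```

Note that this per-coordinate ratio is a vector analogue of a Barzilai-Borwein step size; the paper's actual algorithm and safeguards may differ.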

Similar Articles

1. A multivariate adaptive gradient algorithm with reduced tuning efforts.
   Neural Netw. 2022 Aug;152:499-509. doi: 10.1016/j.neunet.2022.05.016. Epub 2022 May 21.
2. Adaptive Restart of the Optimized Gradient Method for Convex Optimization.
   J Optim Theory Appl. 2018 Jul;178(1):240-263. doi: 10.1007/s10957-018-1287-4. Epub 2018 May 7.
3. Distributed Stochastic Gradient Tracking Algorithm With Variance Reduction for Non-Convex Optimization.
   IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5310-5321. doi: 10.1109/TNNLS.2022.3170944. Epub 2023 Sep 1.
4. A novel adaptive momentum method for medical image classification using convolutional neural network.
   BMC Med Imaging. 2022 Mar 1;22(1):34. doi: 10.1186/s12880-022-00755-z.
5. Online Learning for DNN Training: A Stochastic Block Adaptive Gradient Algorithm.
   Comput Intell Neurosci. 2022 Jun 2;2022:9337209. doi: 10.1155/2022/9337209. eCollection 2022.
6. AdaSAM: Boosting sharpness-aware minimization with adaptive learning rate and momentum for training deep neural networks.
   Neural Netw. 2024 Jan;169:506-519. doi: 10.1016/j.neunet.2023.10.044. Epub 2023 Nov 1.
7. Shuffling-type gradient method with bandwidth-based step sizes for finite-sum optimization.
   Neural Netw. 2024 Nov;179:106514. doi: 10.1016/j.neunet.2024.106514. Epub 2024 Jul 6.
8. Gradient regularization of Newton method with Bregman distances.
   Math Program. 2024;204(1-2):1-25. doi: 10.1007/s10107-023-01943-7. Epub 2023 Mar 24.
9. Selecting the best optimizers for deep learning-based medical image segmentation.
   Front Radiol. 2023 Sep 21;3:1175473. doi: 10.3389/fradi.2023.1175473. eCollection 2023.
10. A fast saddle-point dynamical system approach to robust deep learning.
    Neural Netw. 2021 Jul;139:33-44. doi: 10.1016/j.neunet.2021.02.021. Epub 2021 Feb 26.

Cited By

1. Early Detection of Lung Nodules Using a Revolutionized Deep Learning Model.
   Diagnostics (Basel). 2023 Nov 20;13(22):3485. doi: 10.3390/diagnostics13223485.
2. A Deep Learning Framework with an Intermediate Layer Using the Swarm Intelligence Optimizer for Diagnosing Oral Squamous Cell Carcinoma.
   Diagnostics (Basel). 2023 Nov 16;13(22):3461. doi: 10.3390/diagnostics13223461.
3. Optimizing Inference Distribution for Efficient Kidney Tumor Segmentation Using a UNet-PWP Deep-Learning Model with XAI on CT Scan Images.
   Diagnostics (Basel). 2023 Oct 18;13(20):3244. doi: 10.3390/diagnostics13203244.
4. An Approach to Binary Classification of Alzheimer's Disease Using LSTM.
   Bioengineering (Basel). 2023 Aug 9;10(8):950. doi: 10.3390/bioengineering10080950.