
Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning

Author Information

Shi Pu, Alex Olshevsky, Ioannis Ch. Paschalidis

Affiliations

Institute for Data and Decision Analytics, The Chinese University of Hong Kong, Shenzhen, China, and Shenzhen Research Institute of Big Data. The research was conducted while the author was with the Division of Systems Engineering, Boston University, Boston, MA.

Department of Electrical and Computer Engineering and Division of Systems Engineering, Boston University, Boston, MA.

Publication Information

IEEE Signal Process Mag. 2020 May;37(3):114-122. doi: 10.1109/msp.2020.2975212. Epub 2020 May 6.

DOI: 10.1109/msp.2020.2975212
PMID: 33746471
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7977622/
Abstract

We provide a discussion of several recent results which, in certain scenarios, are able to overcome a barrier in distributed stochastic optimization for machine learning. Our focus is the so-called asymptotic network independence property, which is achieved whenever a distributed method executed over a network of nodes asymptotically converges to the optimal solution at a comparable rate to a centralized method with the same computational power as the entire network. We explain this property through an example involving the training of ML models and sketch a short mathematical analysis comparing the performance of distributed stochastic gradient descent (DSGD) with centralized stochastic gradient descent (SGD).
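To make the DSGD-versus-SGD comparison concrete, below is a minimal sketch (our own illustration, not the paper's analysis or experiments) that runs the two methods side by side on a toy least-squares problem. Everything here is an assumption chosen for illustration: eight nodes, a ring mixing matrix W with lazy Metropolis weights, and a 1/k step size. DSGD means "gossip with neighbors, then take a local stochastic gradient step," while the centralized baseline averages one stochastic gradient per node at each step, i.e., it uses the same total computational power as the network.

```python
import numpy as np

# Illustrative sketch only (not the authors' experiment): compare centralized
# SGD with distributed SGD (DSGD) on a toy least-squares problem
# f(x) = (1/n) * sum_i 0.5 * ||A_i x - b_i||^2, one data block per node.
rng = np.random.default_rng(0)
n_nodes, dim = 8, 5                               # assumed problem size
A = rng.normal(size=(n_nodes, 20, dim))
x_true = rng.normal(size=dim)
b = A @ x_true + 0.1 * rng.normal(size=(n_nodes, 20))

def stoch_grad(i, x):
    """Stochastic gradient of node i's local loss from one random sample."""
    j = rng.integers(A.shape[1])
    a, y = A[i, j], b[i, j]
    return (a @ x - y) * a

# Doubly stochastic mixing matrix for a ring graph (lazy Metropolis weights).
W = np.eye(n_nodes) * 0.5
for i in range(n_nodes):
    W[i, (i - 1) % n_nodes] += 0.25
    W[i, (i + 1) % n_nodes] += 0.25

x_cent = np.zeros(dim)                            # centralized SGD iterate
X = np.zeros((n_nodes, dim))                      # one DSGD iterate per node
for k in range(1, 5001):
    alpha = 1.0 / k                               # diminishing step size
    # Centralized SGD with the network's total computational power:
    # average n stochastic gradients per step.
    g = np.mean([stoch_grad(i, x_cent) for i in range(n_nodes)], axis=0)
    x_cent -= alpha * g
    # DSGD: mix with neighbors via W, then take a local stochastic gradient step.
    G = np.stack([stoch_grad(i, X[i]) for i in range(n_nodes)])
    X = W @ X - alpha * G

print("centralized SGD error :", np.linalg.norm(x_cent - x_true))
print("DSGD (node average)   :", np.linalg.norm(X.mean(axis=0) - x_true))
```

Under asymptotic network independence, the node-averaged DSGD iterate is expected to approach the accuracy of the centralized iterate as the iteration count grows, with the network topology influencing only the length of the transient phase.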


Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83ea/7977622/dc8afd2b1ae1/nihms-1590731-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83ea/7977622/0528ffba4ff6/nihms-1590731-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83ea/7977622/37ff883ce31b/nihms-1590731-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83ea/7977622/19ebcc9c0a2c/nihms-1590731-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83ea/7977622/bf60ad961241/nihms-1590731-f0005.jpg

Similar Articles

1. Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning.
IEEE Signal Process Mag. 2020 May;37(3):114-122. doi: 10.1109/msp.2020.2975212. Epub 2020 May 6.
2. A Sharp Estimate on the Transient Time of Distributed Stochastic Gradient Descent.
IEEE Trans Automat Contr. 2022 Nov;67(11):5900-5915. doi: 10.1109/tac.2021.3126253. Epub 2021 Nov 9.
3. Decentralized stochastic sharpness-aware minimization algorithm.
Neural Netw. 2024 Aug;176:106325. doi: 10.1016/j.neunet.2024.106325. Epub 2024 Apr 17.
4. Personalized On-Device E-Health Analytics With Decentralized Block Coordinate Descent.
IEEE J Biomed Health Inform. 2022 Jun;26(6):2778-2786. doi: 10.1109/JBHI.2022.3140455. Epub 2022 Jun 3.
5. A(DP)²SGD: Asynchronous Decentralized Parallel Stochastic Gradient Descent With Differential Privacy.
IEEE Trans Pattern Anal Mach Intell. 2022 Nov;44(11):8036-8047. doi: 10.1109/TPAMI.2021.3107796. Epub 2022 Oct 4.
6. Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions.
J Mach Learn Res. 2020;21.
7. The effect of choosing optimizer algorithms to improve computer vision tasks: a comparative study.
Multimed Tools Appl. 2023;82(11):16591-16633. doi: 10.1007/s11042-022-13820-0. Epub 2022 Sep 28.
8. Communication-Censored Distributed Stochastic Gradient Descent.
IEEE Trans Neural Netw Learn Syst. 2022 Nov;33(11):6831-6843. doi: 10.1109/TNNLS.2021.3083655. Epub 2022 Oct 27.
9. Weighted SGD for ℓp Regression with Randomized Preconditioning.
Proc Annu ACM SIAM Symp Discret Algorithms. 2016 Jan;2016:558-569. doi: 10.1137/1.9781611974331.ch41.
10. Preconditioned Stochastic Gradient Descent.
IEEE Trans Neural Netw Learn Syst. 2018 May;29(5):1454-1466. doi: 10.1109/TNNLS.2017.2672978. Epub 2017 Mar 9.

Cited By

1. A Sharp Estimate on the Transient Time of Distributed Stochastic Gradient Descent.
IEEE Trans Automat Contr. 2022 Nov;67(11):5900-5915. doi: 10.1109/tac.2021.3126253. Epub 2021 Nov 9.

References Cited by This Article

1. A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization.
J Mach Learn Res. 2018 Jan;19(1):517-564. Epub 2018 Jan 1.
2. Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions.
J Mach Learn Res. 2020;21.
3. Federated learning of predictive models from federated Electronic Health Records.
Int J Med Inform. 2018 Apr;112:59-67. doi: 10.1016/j.ijmedinf.2018.01.007. Epub 2018 Jan 12.