Scardapane Simone, Di Lorenzo Paolo
Department of Information Engineering, Electronics and Telecommunications, "Sapienza" University of Rome, Via Eudossiana 18, 00184 Rome, Italy.
Department of Engineering, University of Perugia, Via G. Duranti 93, 06125, Perugia, Italy.
Neural Netw. 2017 Jul;91:42-54. doi: 10.1016/j.neunet.2017.04.004. Epub 2017 Apr 19.
The aim of this paper is to develop a general framework for training neural networks (NNs) in a distributed environment, where training data is partitioned over a set of agents that communicate with each other through a sparse, possibly time-varying, connectivity pattern. In such a distributed scenario, the training problem can be formulated as the (regularized) optimization of a non-convex social cost function, given by the sum of local (non-convex) costs, where each agent contributes with a single error term defined with respect to its local dataset. To devise a flexible and efficient solution, we customize a recently proposed framework for non-convex optimization over networks, which hinges on a (primal) convexification-decomposition technique to handle non-convexity, and a dynamic consensus procedure to diffuse information among the agents. Several typical choices for the training criterion (e.g., squared loss, cross entropy, etc.) and regularization (e.g., ℓ2 norm, sparsity-inducing penalties, etc.) are included in the framework and explored throughout the paper. Convergence to a stationary solution of the social non-convex problem is guaranteed under mild assumptions. Additionally, we show a principled way of allowing each agent to exploit a possible multi-core architecture (e.g., a local cloud) in order to parallelize its local optimization step, resulting in strategies that are both distributed (across the agents) and parallel (inside each agent) in nature. A comprehensive set of experimental results validates the proposed approach.
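To make the setting concrete, the following is a minimal sketch, not the paper's exact algorithm, of a distributed scheme combining a convexified local step with dynamic consensus (gradient tracking) for a toy social cost given by the sum of local squared losses plus an ℓ2 regularizer. All names and hyper-parameters (number of agents, the ring communication graph, the mixing matrix W, the proximal weight tau, the step size alpha) are illustrative assumptions and not taken from the paper.

```python
import numpy as np

# Hypothetical sketch: in-network convexification-decomposition with dynamic
# consensus (gradient tracking) on a toy distributed least-squares problem.
# Each agent only accesses its local dataset (A_i, b_i) and communicates with
# its neighbors on a ring graph; hyper-parameters are illustrative assumptions.

rng = np.random.default_rng(0)

n_agents, n_samples, dim = 8, 50, 10   # agents, samples per agent, weight size
lam = 0.1                              # l2 regularization weight (shared)
tau, alpha = 20.0, 0.5                 # proximal weight and combination step

# Local datasets: agent i only sees (A[i], b[i]).
A = [rng.normal(size=(n_samples, dim)) for _ in range(n_agents)]
w_true = rng.normal(size=dim)
b = [A_i @ w_true + 0.1 * rng.normal(size=n_samples) for A_i in A]

def local_grad(i, x):
    """Gradient of agent i's local cost: squared loss + its share of the l2 penalty."""
    return A[i].T @ (A[i] @ x - b[i]) / n_samples + (lam / n_agents) * x

# Doubly stochastic mixing matrix for a ring graph (Metropolis weights).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    for j in ((i - 1) % n_agents, (i + 1) % n_agents):
        W[i, j] = 1.0 / 3.0
    W[i, i] = 1.0 - W[i].sum()

# Per-agent estimates x_i and gradient-tracking variables y_i.
x = np.zeros((n_agents, dim))
y = np.array([local_grad(i, x[i]) for i in range(n_agents)])

for it in range(500):
    grads_old = np.array([local_grad(i, x[i]) for i in range(n_agents)])

    # 1) Local convexified step: each agent minimizes a linearized, strongly
    #    convex surrogate of the social cost around x_i, using n_agents * y_i
    #    as its running estimate of the sum of all local gradients.
    x_tilde = x - (n_agents / tau) * y
    z = x + alpha * (x_tilde - x)

    # 2) Consensus on the estimates over the sparse communication graph.
    x = W @ z

    # 3) Dynamic consensus (gradient tracking) update.
    grads_new = np.array([local_grad(i, x[i]) for i in range(n_agents)])
    y = W @ y + grads_new - grads_old

print("disagreement across agents:", np.linalg.norm(x - x.mean(axis=0)))
print("distance to w_true:", np.linalg.norm(x.mean(axis=0) - w_true))
```

In this sketch the surrogate is simply the gradient linearization plus a proximal term, so the local step has a closed form; with a non-convex NN loss, as in the paper, the same structure applies but the local convexified subproblem is generally solved numerically, and the subproblem is where a multi-core agent could parallelize its work.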