
Decentralized Federated Averaging

Authors

Sun Tao, Li Dongsheng, Wang Bao

Publication

IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):4289-4301. doi: 10.1109/TPAMI.2022.3196503. Epub 2023 Mar 7.

Abstract

Federated averaging (FedAvg) is a communication-efficient algorithm for distributed training with an enormous number of clients. In FedAvg, clients keep their data locally for privacy protection; a central parameter server is used to communicate between clients. This central server distributes the parameters to each client and collects the updated parameters from the clients. FedAvg has mostly been studied in a centralized fashion, which requires massive communication between the central server and the clients and can lead to channel congestion. Moreover, attacking the central server can compromise the privacy of the whole system. Decentralization can significantly reduce the communication load of the busiest node (the central one) because every node communicates only with its neighbors. To this end, in this paper we study decentralized FedAvg with momentum (DFedAvgM), implemented on clients connected by an undirected graph. In DFedAvgM, all clients perform stochastic gradient descent with momentum and communicate only with their neighbors. To further reduce the communication cost, we also consider a quantized DFedAvgM. The proposed algorithm involves the mixing matrix, momentum, client training with multiple local iterations, and quantization, which introduce extra terms into the Lyapunov analysis; the analysis in this paper is therefore considerably more challenging than for previous decentralized (momentum) SGD or FedAvg. We prove convergence of the (quantized) DFedAvgM under mild assumptions; the convergence rate improves to sublinear when the loss function satisfies the PŁ property. Numerically, we find that the proposed algorithm outperforms FedAvg in both convergence speed and communication cost.
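The structure the abstract describes (a mixing matrix over an undirected graph, gossip with neighbors only, and multiple local SGD-with-momentum iterations per round) can be sketched in a toy simulation. This is a minimal illustration under hypothetical assumptions, not the paper's implementation: 8 clients on a ring topology, each with a local quadratic loss f_i(x) = ½‖x − c_i‖², and arbitrary hyperparameters; the quantization step is omitted.

```python
import numpy as np

# Hypothetical toy setup: 8 clients on an undirected ring, each minimizing
# a local quadratic f_i(x) = 0.5 * ||x - c_i||^2 (global optimum: mean of c_i).
rng = np.random.default_rng(0)
n_clients, dim = 8, 5
targets = rng.normal(size=(n_clients, dim))  # local optima c_i

# Symmetric, doubly stochastic mixing matrix for the ring graph:
# each client averages itself with its two neighbors.
W = np.zeros((n_clients, n_clients))
for i in range(n_clients):
    W[i, i] = 0.5
    W[i, (i - 1) % n_clients] = 0.25
    W[i, (i + 1) % n_clients] = 0.25

x = np.zeros((n_clients, dim))   # per-client parameters
v = np.zeros_like(x)             # per-client momentum buffers
lr, beta, local_steps, rounds = 0.1, 0.9, 5, 200

for _ in range(rounds):
    x = W @ x                        # gossip step: neighbors only, no server
    for _ in range(local_steps):     # multiple local momentum-SGD iterations
        grad = x - targets           # gradient of each client's quadratic
        v = beta * v + grad
        x = x - lr * v

# The client average approaches the global optimum mean(c_i), while
# individual clients retain a bounded consensus error.
print(np.linalg.norm(x.mean(axis=0) - targets.mean(axis=0)))
```

Because W is doubly stochastic, the gossip step preserves the average of the client parameters exactly, so the averaged iterate follows plain heavy-ball dynamics toward the global optimum; the per-client spread reflects the consensus error that the paper's Lyapunov analysis must control.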

