Wang Haoming, Gao Wei
University of Pittsburgh.
Proc AAAI Conf Artif Intell. 2025;39(20):21080-21089. doi: 10.1609/aaai.v39i20.35405. Epub 2025 Apr 11.
Federated Learning (FL) can be affected by data and device heterogeneities, caused by clients' different local data distributions and by latencies in uploading model updates (i.e., staleness). Traditional schemes treat these heterogeneities as two separate and independent aspects, but this assumption is unrealistic in practical FL scenarios where they are intertwined. In these cases, traditional FL schemes are ineffective, and a better approach is to convert a stale model update into an unstale one. In this paper, we present a new FL framework that ensures the accuracy and computational efficiency of this conversion, thereby effectively tackling the intertwined heterogeneities that may cause unlimited staleness in model updates. Our basic idea is to estimate the distributions of clients' local training data from their uploaded stale model updates, and to use these estimates to compute unstale client model updates. In this way, our approach requires neither an auxiliary dataset nor fully trained client local models, and it does not incur any additional computation or communication overhead at client devices. We compared our approach with existing FL strategies on mainstream datasets and models, and showed that it can improve the trained model accuracy by up to 25% and reduce the number of required training epochs by up to 35%. Source code is available at: https://github.com/pittisl/FL-with-intertwined-heterogeneity.
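The conversion idea can be illustrated with a toy example. The sketch below is not the authors' implementation (see the repository linked above for that). It assumes a linear regression model, a single local gradient step per client, and surrogate features drawn from the same distribution as the client's features, so that only the labels are fitted to the stale update. The server fits surrogate data that reproduces the stale update at the old global model, then recomputes the update at the current global model. All names (`local_update`, `X_sur`, `y_sur`, `w_old`, `w_new`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m, lr = 5, 2000, 2000, 0.1  # features, client samples, surrogate samples, step size

def local_update(w, X, y, lr):
    """One gradient step on mean squared error; returns the model delta."""
    grad = X.T @ (X @ w - y) / len(y)
    return -lr * grad

# Client side: private data with a client-specific labeling rule
# (a stand-in for data heterogeneity). This data never leaves the client.
X_client = rng.standard_normal((n, d))
w_client = rng.standard_normal(d)
y_client = X_client @ w_client

w_old = rng.standard_normal(d)  # global model the client started from
w_new = rng.standard_normal(d)  # current global model when the update arrives

# The client uploads only this stale update, computed against w_old.
stale = local_update(w_old, X_client, y_client, lr)

# Server side: fit surrogate data that reproduces the stale update at w_old.
# Assumption: surrogate features are sampled from a known feature distribution,
# and only the surrogate labels are solved for.
X_sur = rng.standard_normal((m, d))
# Require X_sur.T @ (X_sur @ w_old - y_sur) / m == -stale / lr, i.e.
# X_sur.T @ y_sur == X_sur.T @ X_sur @ w_old + m * stale / lr.
target = X_sur.T @ (X_sur @ w_old) + m * stale / lr
y_sur = np.linalg.lstsq(X_sur.T, target, rcond=None)[0]  # minimum-norm solution

# Convert: recompute the update on the surrogate data at the current model.
converted = local_update(w_new, X_sur, y_sur, lr)

# Evaluation only: the update the client would have sent with no delay.
fresh = local_update(w_new, X_client, y_client, lr)
err_stale = np.linalg.norm(stale - fresh)
err_conv = np.linalg.norm(converted - fresh)
print(f"error of stale update:     {err_stale:.4f}")
print(f"error of converted update: {err_conv:.4f}")
```

In this toy setting the surrogate labels can be solved in closed form. With nonlinear models and multiple local training steps, no such closed form exists, and fitting the surrogate data would need an iterative optimization instead.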