Yan Yihan, Tong Xiaojun, Wang Shen
IEEE Trans Neural Netw Learn Syst. 2024 Sep;35(9):12796-12809. doi: 10.1109/TNNLS.2023.3264740. Epub 2024 Sep 3.
Federated learning (FL) is a distributed machine learning framework that allows resource-constrained clients to jointly train a global model without compromising data privacy. Although FL is widely adopted, high degrees of systems and statistical heterogeneity remain two main challenges, which can lead to divergence and nonconvergence. Clustered FL addresses statistical heterogeneity directly by discovering the geometric structure of clients with different data-generating distributions and training multiple global models. The number of clusters encodes prior knowledge about the clustering structure and has a significant impact on the performance of clustered FL methods. Existing clustered FL methods are inadequate for adaptively inferring the optimal number of clusters in environments with high systems heterogeneity. To address this issue, we propose an iterative clustered FL (ICFL) framework in which the server dynamically discovers the clustering structure by successively performing incremental clustering and clustering within one iteration. We focus on the average connectivity within each cluster and, based on mathematical analysis, give incremental clustering and clustering methods that are compatible with ICFL. We evaluate ICFL in experiments under high degrees of systems and statistical heterogeneity, on multiple datasets, and with convex and nonconvex objectives. Experimental results verify our theoretical analysis and show that ICFL outperforms several clustered FL baseline methods.
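To make the notion of average connectivity within a cluster more concrete, the following is a minimal sketch and not the ICFL algorithm itself. It assumes, purely for illustration, that each client is represented by a flattened update vector, that connectivity between two clients is the cosine similarity of their updates, and that a client joins an existing cluster only if its mean similarity to the members reaches a hypothetical `threshold`. The function names and the greedy assignment rule are assumptions introduced here; the abstract does not specify them.

```python
import numpy as np

def cosine_similarity_matrix(updates):
    """Pairwise cosine similarity between rows of `updates` (clients x params)."""
    norms = np.linalg.norm(updates, axis=1, keepdims=True)
    unit = updates / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T

def average_connectivity(sim, members):
    """Mean pairwise similarity among cluster members (1.0 for a singleton)."""
    n = len(members)
    if n < 2:
        return 1.0
    sub = sim[np.ix_(members, members)]
    # Exclude the diagonal (self-similarity) from the average.
    return (sub.sum() - np.trace(sub)) / (n * (n - 1))

def incremental_assign(sim, threshold=0.5):
    """Greedy stand-in for incremental clustering (an assumption, not the paper's rule).

    Clients are visited in index order. Each one joins the existing cluster
    with the highest mean similarity to its members, provided that mean is
    at least `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for client in range(sim.shape[0]):
        best, best_score = None, threshold
        for idx, members in enumerate(clusters):
            score = sim[client, members].mean()
            if score >= best_score:
                best, best_score = idx, score
        if best is None:
            clusters.append([client])
        else:
            clusters[best].append(client)
    return clusters

# Six synthetic client updates: three point roughly along the first axis,
# three roughly along the second (illustrative values only).
updates = np.array([
    [1.0, 0.1], [0.9, 0.0], [1.0, -0.1],
    [0.1, 1.0], [0.0, 0.9], [-0.1, 1.0],
])
sim = cosine_similarity_matrix(updates)
clusters = incremental_assign(sim, threshold=0.5)
within = [average_connectivity(sim, c) for c in clusters]
overall = average_connectivity(sim, list(range(len(updates))))
```

In this toy setting the number of clusters is not fixed in advance: it emerges from the threshold, and the average connectivity inside each discovered cluster is much higher than the average over all clients taken together. How ICFL actually chooses when to split or merge is determined by the analysis in the paper, not by this sketch.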