Gao Tianrun, Liu Keyan, Yang Yuning, Liu Xiaohong, Zhang Ping, Wang Guangyu
IEEE J Biomed Health Inform. 2025 May 6;PP. doi: 10.1109/JBHI.2025.3567055.
Federated learning (FL) has emerged as a promising distributed paradigm that enables collaborative model training while preserving data privacy, but it suffers from performance degradation due to data heterogeneity. Although clustered federated learning (CFL) attempts to address this challenge by grouping clients with similar data distributions, existing methods are inefficient at capturing client data representations, leading to incorrect cluster identities and inferior cluster performance. To overcome these limitations, we propose an efficient prototype-based CFL framework (FedPC). Specifically, we introduce a dual-prototype strategy that combines specific prototypes and generalized prototypes to capture class representations for cluster identities, along with a prototype-contrastive training mechanism that maximizes intra-cluster prototype consistency to improve cluster performance. Extensive experiments on medical imaging datasets (BloodMNIST and DermaMNIST) demonstrate that FedPC outperforms nine state-of-the-art (SOTA) approaches, achieving average improvements of 2.17% and 3.47%, respectively. Furthermore, FedPC reduces communication overhead by a factor of 3.33 to 5.68 compared to SOTA methods, showcasing its efficiency in real-world FL scenarios.
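The abstract does not give FedPC's exact formulation, so the following PyTorch sketch only illustrates one common way class prototypes and a prototype-contrastive objective can be realized; the function names, the temperature parameter, and the assumption that every class appears on each client are all hypothetical, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def class_prototypes(features, labels, num_classes):
    # One prototype per class: the mean feature embedding of that class's
    # samples. Classes absent from the batch keep a zero prototype
    # (a simplifying assumption for this sketch).
    protos = torch.zeros(num_classes, features.size(1))
    for c in labels.unique():
        protos[c] = features[labels == c].mean(dim=0)
    return protos

def prototype_contrastive_loss(local_protos, cluster_protos, temperature=0.5):
    # Generic prototype-contrastive objective: pull each client's class-c
    # prototype toward the cluster-level class-c prototype (positive pair)
    # and push it away from other classes' prototypes (negatives).
    local = F.normalize(local_protos, dim=1)
    cluster = F.normalize(cluster_protos, dim=1)
    logits = local @ cluster.t() / temperature   # pairwise cosine similarities
    targets = torch.arange(local.size(0))        # class c matches class c
    return F.cross_entropy(logits, targets)
```

Under this reading, minimizing the loss drives intra-cluster prototype consistency: clients assigned to the same cluster align their per-class prototypes, which is the stated goal of FedPC's prototype-contrastive training.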