Khan Koffka
Department of Computing and Information Technology, Faculty of Science and Technology, The University of the West Indies, St. Augustine Campus, St. Augustine 350462, Trinidad and Tobago.
Sensors (Basel). 2025 Jun 14;25(12):3728. doi: 10.3390/s25123728.
Federated Learning (FL) is a distributed machine learning paradigm where a global model is collaboratively trained across multiple decentralized clients without exchanging raw data. This is especially important in sensor networks and edge intelligence, where data privacy, bandwidth constraints, and data locality are paramount. Traditional aggregation methods such as FedAvg struggle with the highly heterogeneous (non-IID) client data common in these settings: they weight client updates primarily by dataset size, potentially overlooking the informativeness or diversity of each client's contribution. These limitations are especially pronounced in sensor networks and IoT environments, where clients may hold sparse, unbalanced, or single-modality data. We propose FedEmerge, an entropy-guided aggregation approach that adjusts each client's impact on the global model based on the information entropy of its local data distribution. This formulation introduces a principled way to quantify and reward data diversity, enabling an emergent collective learning dynamic in which globally informative updates drive convergence. Unlike existing methods that weight updates by sample count or heuristics, FedEmerge prioritizes clients with more representative, high-entropy data. The FedEmerge algorithm is presented with full mathematical detail, and we prove its convergence under the Polyak-Łojasiewicz (PL) condition. Theoretical analysis shows that FedEmerge achieves linear convergence to the optimal model under standard assumptions (smoothness and the PL condition), similar to centralized gradient descent. Empirically, FedEmerge improves global model accuracy and convergence speed on highly skewed non-IID benchmarks, and it reduces performance disparities among clients compared to FedAvg. Evaluations on CIFAR-10 (non-IID), Federated EMNIST, and Shakespeare datasets confirm its effectiveness in practical edge-learning settings. This entropy-guided federated strategy demonstrates that weighting client updates by data diversity enhances learning outcomes in heterogeneous networks. The approach preserves privacy like standard FL and adds minimal computational overhead, making it a practical solution for real-world federated systems.
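The abstract does not give the exact weighting rule, but the core idea can be sketched: compute the Shannon entropy H_k = -Σ_c p_{k,c} log p_{k,c} of each client k's local label distribution and let higher-entropy clients contribute more to the aggregate. Below is a minimal Python sketch under that assumption (weights proportional to normalized entropy); the paper's actual FedEmerge rule may combine entropy with sample counts or apply a different normalization, and the function names here are illustrative only.

```python
import numpy as np

def label_entropy(label_counts):
    """Shannon entropy (in nats) of one client's label distribution.

    label_counts: per-class sample counts for the client.
    """
    p = np.asarray(label_counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # convention: 0 * log(0) = 0
    return float(-np.sum(p * np.log(p)))

def entropy_weighted_aggregate(client_params, client_label_counts):
    """Entropy-weighted model averaging (a sketch of the FedEmerge idea).

    client_params: list of parameter vectors (np.ndarray), one per client.
    client_label_counts: list of per-class count arrays, one per client.
    Returns the entropy-weighted average of the client parameters.
    """
    entropies = np.array([label_entropy(c) for c in client_label_counts])
    total = entropies.sum()
    if total > 0:
        weights = entropies / total
    else:
        # Degenerate case: every client holds a single class; fall back
        # to uniform weighting rather than dividing by zero.
        weights = np.full(len(client_params), 1.0 / len(client_params))
    return sum(w * theta for w, theta in zip(weights, client_params))

# Toy round: three clients, the third holding the most diverse data.
params = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
counts = [[100, 0, 0], [0, 100, 0], [40, 30, 30]]
print(entropy_weighted_aggregate(params, counts))
```

In this toy round the single-class clients receive zero weight and the diverse third client dominates the aggregate, which illustrates how such a rule rewards data diversity rather than raw dataset size as FedAvg does.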