Chen Aiguo, Fu Yang, Sha Zexin, Lu Guoming
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China.
Front Plant Sci. 2022 Jun 9;13:908814. doi: 10.3389/fpls.2022.908814. eCollection 2022.
Federated learning is a distributed machine learning framework that enables nodes with computation and storage capabilities to train a global model jointly while keeping their data stored locally. This process can improve modeling efficiency while preserving data privacy. Federated learning can therefore be applied widely in distributed joint analysis scenarios, such as smart plant protection systems, in which widely networked IoT devices monitor critical plant production data to improve crop yields. However, the data collected by different IoT devices can be non-independent and identically distributed (non-IID), which raises the challenge of statistical heterogeneity. Studies have shown that statistical heterogeneity can lead to a marked decline in the efficiency of federated learning, making it difficult to apply in practice. To improve the efficiency of federated learning under statistical heterogeneity, this paper proposes ACSFed, an adaptive client selection algorithm for federated learning in statistically heterogeneous scenarios. Instead of selecting clients at random, ACSFed dynamically calculates, for each communication round, the probability that each client is selected to train the model, based on the client's local statistical heterogeneity and its previous training performance. Clients with heavier statistical heterogeneity or poorer training performance are more likely to be selected to participate in later training. This client selection strategy enables the federated model to learn global statistical knowledge faster and thereby promotes its convergence. Multiple experiments on public benchmark datasets demonstrate these efficiency improvements in heterogeneous settings.
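The selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas ACSFed uses, so everything concrete here is an assumption made for illustration: heterogeneity is measured as the total variation distance between a client's label distribution and the pooled label distribution, training performance is represented by the client's most recent local loss, the two are mixed with a hypothetical weight `alpha`, and the function names `selection_probabilities` and `select_clients` are invented. The sketch also assumes that every client holds at least one sample and that label counts may be shared with the server.

```python
import numpy as np


def selection_probabilities(label_counts, last_losses, alpha=0.5):
    """Return one selection probability per client (illustrative only).

    label_counts: array of shape (n_clients, n_classes), samples per class.
    last_losses:  array of shape (n_clients,), most recent local loss.
    alpha:        hypothetical weight between heterogeneity and loss.
    """
    counts = np.asarray(label_counts, dtype=float)
    losses = np.asarray(last_losses, dtype=float)

    # Label distribution of each client and of all clients pooled together.
    local = counts / counts.sum(axis=1, keepdims=True)
    pooled = counts.sum(axis=0) / counts.sum()

    # Assumed heterogeneity measure: total variation distance, in [0, 1].
    hetero = 0.5 * np.abs(local - pooled).sum(axis=1)

    # Assumed performance measure: min-max scaled loss (higher = worse).
    span = losses.max() - losses.min()
    perf = (losses - losses.min()) / span if span > 0 else np.zeros_like(losses)

    # Higher heterogeneity or worse performance gives a higher score.
    # The small constant keeps every client selectable.
    score = alpha * hetero + (1.0 - alpha) * perf + 1e-8
    return score / score.sum()


def select_clients(probs, k, rng):
    """Sample k distinct client indices according to probs."""
    return rng.choice(len(probs), size=k, replace=False, p=probs)


# Example round: client 0 is balanced, clients 1 and 2 each hold one class.
probs = selection_probabilities(
    label_counts=[[50, 50], [100, 0], [0, 100]],
    last_losses=[0.2, 0.9, 0.5],
)
chosen = select_clients(probs, k=2, rng=np.random.default_rng(0))
```

In this example the balanced, low-loss client 0 receives the lowest probability, while the skewed, high-loss client 1 receives the highest, which matches the qualitative behaviour the abstract describes. A real system would recompute the probabilities every communication round as losses change.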