Li Ling, Zhu Lidong, Li Weibang
National Key Laboratory of Wireless Communications, University of Electronic Science and Technology of China, Chengdu 611731, China.
School of Computer Science and Engineering, Southwest Minzu University, Chengdu 610041, China.
Sensors (Basel). 2024 Dec 16;24(24):8028. doi: 10.3390/s24248028.
Cloud-edge-end computing architectures are crucial for large-scale edge data processing and analysis. However, the diversity of terminal nodes and the complexity of tasks in such architectures often produce data that are not independent and identically distributed (non-IID), which makes it difficult to handle data heterogeneity while protecting privacy. To address this, we propose a privacy-preserving federated learning method based on cloud-edge-end collaboration. The method accounts for both the three-tier structure of cloud-edge-end systems and the non-IID nature of terminal node data, improving model accuracy while protecting the privacy of terminal node data. It groups terminal nodes by the similarity of their data distributions and constructs edge subnetworks that train in collaboration with edge nodes, thereby mitigating the negative impact of non-IID data. Furthermore, we enhance WGAN-GP with an attention mechanism to generate balanced synthetic data that retain key patterns from the original datasets, reducing the adverse effect of non-IID data on global model accuracy while preserving data privacy. In addition, we introduce data resampling and loss function weighting strategies to mitigate the model bias caused by imbalanced data distributions. Experimental results on real-world datasets demonstrate that the proposed method significantly outperforms existing approaches in terms of model accuracy, F1-score, and other metrics.