Data-free knowledge distillation via generator-free data generation for Non-IID federated learning.

Affiliations

Sun Yat-sen University, School of Computer Science and Engineering, Guangzhou, China.

Sun Yat-sen University, School of Software Engineering, Zhuhai, China.

Publication

Neural Netw. 2024 Nov;179:106627. doi: 10.1016/j.neunet.2024.106627. Epub 2024 Aug 10.

Abstract

Data heterogeneity (Non-IID) in Federated Learning (FL) is a well-known problem that leads to local model drift and performance degradation. Because of its advantages, knowledge distillation has been explored in recent work to refine global models. However, these approaches rely on a proxy dataset or a data generator. First, in many FL scenarios, proxy datasets do not necessarily exist on the server. Second, the quality of data produced by a generator is unstable, and the generator depends on the server's computing resources. In this work, we propose a novel data-free knowledge distillation approach via generator-free data generation for Non-IID FL, dubbed FedFDG. Specifically, FedFDG requires only the local models to generate a pseudo dataset for each client, and it can generate hard samples by adding a regularization term that exploits disagreements between the local and global models. Meanwhile, FedFDG enables flexible use of computational resources by generating the pseudo datasets either locally or on the server. To address the label distribution shift in Non-IID FL, we propose a Data Generation Principle that adaptively controls the label distribution and size of each pseudo dataset based on the client's current state, which allows more client knowledge to be extracted. Knowledge distillation is then performed to transfer the knowledge in the local models to the global model. Extensive experiments demonstrate that our proposed method significantly outperforms state-of-the-art FL methods and can serve as a plugin for existing federated learning methods such as FedAvg and FedProx, improving their performance.
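
The following is a minimal, illustrative sketch (in PyTorch) of the idea described in the abstract: pseudo samples are synthesized directly from a trained local model by optimizing input noise, with an extra regularization term that rewards disagreement between the local and global models (the "hard samples"), and the resulting pseudo data are then used to distill local knowledge into the global model. The toy architecture, function names, exact loss form, and hyperparameters are assumptions for illustration only, not the authors' implementation; in particular, the label batch would in practice be chosen according to the paper's Data Generation Principle, which this sketch does not reproduce.

```python
# Hypothetical sketch of generator-free pseudo-data generation and
# data-free distillation. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCNN(nn.Module):
    """Tiny stand-in for a client/global model (hypothetical architecture)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4)
        )
        self.classifier = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def generate_pseudo_batch(local_model, global_model, labels,
                          steps=50, lr=0.1, disagree_weight=1.0):
    """Optimize input noise so the local model predicts the target labels,
    while a disagreement term pushes toward samples on which the local and
    global models differ. The loss form here is an assumption."""
    local_model.eval()
    global_model.eval()
    x = torch.randn(labels.size(0), 1, 28, 28, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        local_logits = local_model(x)
        global_logits = global_model(x)
        # Fit the target labels under the local model (the knowledge carrier).
        ce = F.cross_entropy(local_logits, labels)
        # Encourage disagreement: maximize KL(local || global) on these inputs.
        disagree = F.kl_div(F.log_softmax(global_logits, dim=1),
                            F.softmax(local_logits, dim=1),
                            reduction="batchmean")
        loss = ce - disagree_weight * disagree
        loss.backward()
        opt.step()
    return x.detach()


def distill_global(global_model, local_model, pseudo_x, epochs=5, temp=2.0):
    """Transfer local-model knowledge to the global model on pseudo data."""
    opt = torch.optim.SGD(global_model.parameters(), lr=0.01)
    local_model.eval()
    for _ in range(epochs):
        opt.zero_grad()
        with torch.no_grad():
            teacher = F.softmax(local_model(pseudo_x) / temp, dim=1)
        student = F.log_softmax(global_model(pseudo_x) / temp, dim=1)
        F.kl_div(student, teacher, reduction="batchmean").backward()
        opt.step()


if __name__ == "__main__":
    local, global_m = SimpleCNN(), SimpleCNN()
    # The per-class label counts would follow the Data Generation Principle;
    # here a fixed random batch of labels is used purely for illustration.
    labels = torch.randint(0, 10, (32,))
    pseudo = generate_pseudo_batch(local, global_m, labels)
    distill_global(global_m, local, pseudo)
```

In a full round, each selected client (or the server, depending on where computation is cheaper) would run the generation step with its own local model, and the server would then distill the collected pseudo data into the global model.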
