Aledhari Mohammed, Razzak Rehma, Parizi Reza M, Saeed Fahad
College of Computing and Software Engineering, Kennesaw State University, Marietta, GA, 30060 USA.
School of Computing and Information Sciences, Florida International University, Miami, FL, 33199 USA.
IEEE Access. 2020;8:140699-140725. doi: 10.1109/access.2020.3013541. Epub 2020 Jul 31.
This paper provides a comprehensive study of Federated Learning (FL) with an emphasis on enabling software and hardware platforms, protocols, real-life applications, and use-cases. FL is applicable to multiple domains, but applying it to different industries brings its own set of obstacles. FL is also known as collaborative learning: algorithms are trained across multiple devices or servers holding decentralized data samples, without exchanging the actual data. This approach differs radically from more established techniques, such as uploading data samples to servers or holding data in some form of distributed infrastructure. FL, in contrast, produces robust models without sharing data, leading to privacy-preserving solutions with stronger security and tighter control over data access. The paper begins with an overview of FL and then presents the technical details of FL enabling technologies, protocols, and applications. Compared to other survey papers in the field, our objective is to provide a more thorough summary of the most relevant protocols, platforms, and real-life use-cases of FL so that data scientists can build better privacy-preserving solutions for industries in critical need of FL. We also review the key challenges presented in the recent literature and summarize related research work. Moreover, we explore both the challenges and advantages of FL and present detailed service use-cases to illustrate how different architectures and protocols that use FL can fit together to deliver the desired results.
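To make the core idea concrete, the following is a minimal sketch of federated-averaging-style training, assuming NumPy weight vectors and a hypothetical local_train() helper for the per-client update (a toy linear regression); it is illustrative only and not the specific protocol implementations surveyed in the paper.

```python
# Minimal federated averaging sketch: each client trains on its own private
# data and only model weights are sent back for aggregation (assumption:
# toy linear-regression clients with NumPy arrays).
import numpy as np

def local_train(global_weights, client_data, lr=0.1, epochs=1):
    """Hypothetical local update: refine the global model on one client's
    private (X, y) data via gradient descent; raw data never leaves the client."""
    w = global_weights.copy()
    X, y = client_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_weights, clients):
    """One communication round: clients train locally, then the server averages
    the returned weights, weighted by each client's sample count."""
    updates, sizes = [], []
    for data in clients:
        updates.append(local_train(global_weights, data))
        sizes.append(len(data[1]))
    return np.average(np.stack(updates), axis=0, weights=np.array(sizes, dtype=float))

# Toy usage: three clients with private data, five rounds of federated averaging.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (30, 50, 20):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(5):
    w = federated_round(w, clients)
print("aggregated weights:", w)
```

The design point this sketch highlights is the one stressed in the abstract: only model parameters cross the network, so the server never observes the clients' raw samples.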