Wainakh Aidmar, Zimmer Ephraim, Subedi Sandeep, Keim Jens, Grube Tim, Karuppayah Shankar, Sanchez Guinea Alejandro, Mühlhäuser Max
Telecooperation Lab, Technical University of Darmstadt, 64289 Darmstadt, Germany.
National Advanced IPv6 Centre (NAv6), University of Science Malaysia, Penang 11800, Malaysia.
Sensors (Basel). 2022 Dec 20;23(1):31. doi: 10.3390/s23010031.
Deep learning pervades heavily data-driven disciplines in research and development. The Internet of Things and sensor systems, which enable smart environments and services, are settings where deep learning can provide invaluable utility. However, the data in these systems are very often directly or indirectly related to people, which raises privacy concerns. Federated learning (FL) mitigates some of these concerns and empowers deep learning in sensor-driven environments by enabling multiple entities to collaboratively train a machine learning model without sharing their data. Nevertheless, a number of works in the literature propose attacks that can manipulate the model and disclose information about the training data in FL. As a result, there has been a growing belief that FL is highly vulnerable to severe attacks. Although these attacks do indeed highlight security and privacy risks in FL, some of them may not be as effective in production deployments because they are feasible only under special, sometimes impractical, assumptions. In this paper, we investigate this issue by conducting a quantitative analysis of the attacks against FL and their evaluation settings in 48 papers. This analysis is the first of its kind to reveal several research gaps with regard to the types and architectures of target models. Additionally, the quantitative analysis allows us to highlight unrealistic assumptions in some attacks related to the hyper-parameters of the model and the data distribution. Furthermore, we identify fallacies in the evaluation of attacks that raise questions about the generalizability of the conclusions. As a remedy, we propose a set of recommendations to promote adequate evaluations.
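The abstract's premise rests on FL's core mechanism: clients train locally on private data and share only model updates, which a server aggregates (as in FedAvg). The following minimal sketch illustrates that round structure on a synthetic linear-regression task using NumPy; it is an illustrative assumption, not the paper's experimental setup, and all data, hyper-parameters, and function names are hypothetical.

```python
import numpy as np

# Minimal FedAvg-style sketch on a synthetic linear-regression task (illustrative only).
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])

def make_client_data(n=100):
    # Each client holds private data that never leaves the client.
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y

def local_update(w, X, y, lr=0.05, epochs=5):
    # Plain gradient descent on the local mean-squared error.
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

clients = [make_client_data() for _ in range(5)]
w_global = np.zeros(3)

for rnd in range(20):
    # Each round: clients train locally, the server averages the returned weights.
    local_weights = [local_update(w_global, X, y) for X, y in clients]
    w_global = np.mean(local_weights, axis=0)

print("recovered weights:", np.round(w_global, 3))
```

Only the locally trained weights are sent to the server in each round; the raw data stay on the clients, which is the property the surveyed attacks attempt to circumvent.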