IEEE Trans Med Imaging. 2023 Jul;42(7):2044-2056. doi: 10.1109/TMI.2023.3239391. Epub 2023 Jun 30.
Federated learning (FL) allows the collaborative training of AI models without the need to share raw data. This capability makes it especially attractive for healthcare applications, where patient and data privacy are of utmost concern. However, recent work on inverting deep neural networks from model gradients has raised concerns about whether FL can prevent the leakage of training data. In this work, we show that the attacks presented in the literature are impractical in FL use cases where the clients' training involves updating the Batch Normalization (BN) statistics, and we provide a new baseline attack that works in such scenarios. Furthermore, we present new ways to measure and visualize potential data leakage in FL. Our work is a step towards establishing reproducible methods of measuring data leakage in FL and could help determine the optimal tradeoffs between privacy-preserving techniques, such as differential privacy, and model accuracy based on quantifiable metrics.
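To make the threat model concrete, the following is a minimal sketch of the basic gradient-inversion idea from the prior literature the abstract refers to (not the paper's new baseline attack): given the gradients a client would share in FL, the attacker optimizes a dummy input and soft label so that the gradients they induce match the observed ones. The toy model, data shapes, and hyperparameters below are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the shared FL model (illustrative assumption).
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 10))
criterion = nn.CrossEntropyLoss()

# Gradients the server observes, computed from one private client sample.
x_true = torch.randn(1, 1, 32, 32)
y_true = torch.tensor([3])
true_grads = torch.autograd.grad(criterion(model(x_true), y_true), model.parameters())
true_grads = [g.detach() for g in true_grads]

# Attacker optimizes dummy data and label logits so their gradients match.
x_dummy = torch.randn(1, 1, 32, 32, requires_grad=True)
y_logits = torch.randn(1, 10, requires_grad=True)
optimizer = torch.optim.Adam([x_dummy, y_logits], lr=0.1)

for step in range(300):
    optimizer.zero_grad()
    # Cross-entropy with soft (optimized) labels for the dummy sample.
    dummy_loss = torch.sum(
        torch.softmax(y_logits, dim=-1) * -torch.log_softmax(model(x_dummy), dim=-1)
    )
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    # Gradient-matching objective: L2 distance between the two gradient sets.
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    optimizer.step()

print(f"final gradient-matching loss: {grad_diff.item():.4f}")
print(f"reconstruction error vs. private sample: {(x_dummy.detach() - x_true).norm().item():.4f}")
```

This sketch omits the BN statistics discussed in the abstract; part of the paper's argument is precisely that client-side BN updates change what information such an attacker can exploit.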