Tunable Privacy Risk Evaluation of Generative Adversarial Networks.

Affiliation

Biomedical Data Science Center, Lausanne University Hospital (CHUV) and University of Lausanne, Switzerland.

Publication information

Stud Health Technol Inform. 2024 Aug 22;316:1233-1237. doi: 10.3233/SHTI240634.

Abstract

Generative machine learning models such as Generative Adversarial Networks (GANs) have been shown to be especially successful in generating realistic synthetic data in the image and tabular domains. However, such generative models, as well as the synthetic data they generate, can reveal information contained in their privacy-sensitive training data, and must therefore be carefully evaluated before being used. The gold-standard method for estimating such privacy leakage is to simulate membership inference attacks (MIAs), in which an attacker attempts to learn whether a given sample was part of the training data of a generative model. The state-of-the-art MIAs against generative models, however, either rely on strong assumptions (knowledge of the exact training dataset size) or require substantial computational power (to retrain many "surrogate" generative models), both of which make them hard to use in practice. In this work, we propose a technique for evaluating privacy risks in GANs that exploits the outputs of the discriminator part of the standard GAN architecture. We evaluate the performance of our attacks in two synthetic image generation applications, in radiology and ophthalmology, and show that our technique provides a more complete picture of the threats by performing worst-case privacy risk estimation and by identifying attacks with higher precision than prior work.
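
To make the idea of a discriminator-based membership inference test concrete, the following is a minimal sketch and not the authors' implementation: the abstract does not state how discriminator outputs are scored or thresholded. It assumes that a higher discriminator output means "more likely a training member", that the evaluator holds a reference set of known non-members, and that the decision threshold is tuned to a target false-positive rate. The function names (`threshold_at_fpr`, `evaluate_attack`) and the synthetic scores are hypothetical stand-ins for real discriminator outputs.

```python
import random


def threshold_at_fpr(non_member_scores, max_fpr):
    """Pick a threshold so that at most max_fpr of known non-members score above it."""
    ranked = sorted(non_member_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # non-members we tolerate misflagging
    # Samples are flagged only if their score is strictly above the threshold.
    return ranked[allowed] if allowed < len(ranked) else float("-inf")


def evaluate_attack(member_scores, non_member_scores, max_fpr=0.01):
    """Simulate a threshold attack on discriminator scores and report its quality."""
    thr = threshold_at_fpr(non_member_scores, max_fpr)
    tp = sum(s > thr for s in member_scores)      # members correctly flagged
    fp = sum(s > thr for s in non_member_scores)  # non-members wrongly flagged
    flagged = tp + fp
    return {
        "threshold": thr,
        "tpr": tp / len(member_scores),
        "fpr": fp / len(non_member_scores),
        # Precision is reported as 0.0 when nothing is flagged (a convention of this sketch).
        "precision": tp / flagged if flagged else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical scores: in practice these would be discriminator outputs D(x)
    # for training samples (members) and held-out samples (non-members).
    rng = random.Random(0)
    members = [rng.gauss(0.7, 0.1) for _ in range(1000)]
    non_members = [rng.gauss(0.5, 0.1) for _ in range(1000)]
    for max_fpr in (0.001, 0.01, 0.1):
        print(max_fpr, evaluate_attack(members, non_members, max_fpr))
```

Lowering `max_fpr` makes the simulated attacker flag fewer samples but with higher confidence, which is one way a risk evaluation could be made tunable between precision-oriented and recall-oriented settings.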
