Lucieri Adriano, Dengel Andreas, Ahmed Sheraz
Smart Data and Knowledge Services (SDS), Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH, Kaiserslautern, Germany.
Computer Science Department, RPTU Kaiserslautern-Landau, Kaiserslautern, Germany.
Front Bioinform. 2023 Jul 5;3:1194993. doi: 10.3389/fbinf.2023.1194993. eCollection 2023.
Artificial Intelligence (AI) has achieved remarkable success in image generation, image analysis, and language modeling, making data-driven techniques increasingly relevant in practical real-world applications and promising enhanced creativity and efficiency for human users. However, the deployment of AI in high-stakes domains such as infrastructure and healthcare still raises concerns regarding algorithm accountability and safety. The emerging field of explainable AI (XAI) has made significant strides in developing interfaces that enable humans to comprehend the decisions made by data-driven models. Among these approaches, concept-based explainability stands out due to its ability to align explanations with high-level concepts familiar to users. Nonetheless, early research in adversarial machine learning has revealed that exposing model explanations can render victim models more susceptible to attacks. This is the first study to investigate and compare the impact of concept-based explanations on the privacy of Deep Learning-based AI models in the context of biomedical image analysis. An extensive privacy benchmark is conducted on three state-of-the-art model architectures (ResNet50, NFNet, ConvNeXt) trained on two biomedical datasets (ISIC and EyePACS) and one synthetic dataset (SCDB). The success of membership inference attacks under exposure of varying degrees of attribution-based and concept-based explanations is systematically compared. The findings indicate that, in theory, concept-based explanations can increase the vulnerability of a private AI system by up to 16% compared to attributions in the baseline setting. However, it is demonstrated that, in more realistic attack scenarios, the threat posed by explanations is negligible in practice. Furthermore, actionable recommendations are provided to ensure the safe deployment of concept-based XAI systems. Finally, the impact of differential privacy (DP) on the quality of concept-based explanations is explored, revealing that DP not only degrades explanation quality but can also have an adverse effect on the models' privacy.
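To illustrate the general threat model evaluated in such a benchmark, the sketch below shows how an attacker might train a membership-inference classifier on a victim model's prediction confidences augmented with exposed explanation (e.g., concept-score) vectors. This is a minimal, hedged illustration only; the synthetic data, feature shapes, and variable names are assumptions and do not reflect the paper's actual code or experimental setup.

# Minimal sketch (not the paper's code): a membership-inference attack classifier
# that ingests the victim model's softmax confidences together with an exposed
# explanation vector (here, 8 hypothetical concept relevance scores).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_features(confidences, explanations):
    """Concatenate victim-model confidences with exposed explanation scores."""
    return np.concatenate([confidences, explanations], axis=1)

# Synthetic stand-ins for shadow-model outputs: 'member' samples tend to
# receive higher confidence than non-members.
n = 1000
member_conf = rng.beta(8, 2, size=(n, 1))       # confident on training members
nonmember_conf = rng.beta(4, 4, size=(n, 1))    # less confident on non-members
member_expl = rng.normal(0.6, 0.2, size=(n, 8))
nonmember_expl = rng.normal(0.5, 0.2, size=(n, 8))

X = np.vstack([make_features(member_conf, member_expl),
               make_features(nonmember_conf, nonmember_expl)])
y = np.concatenate([np.ones(n), np.zeros(n)])   # 1 = member, 0 = non-member

# Train the attack classifier and report its membership-inference AUC.
attack = LogisticRegression(max_iter=1000).fit(X, y)
print("attack AUC:", roc_auc_score(y, attack.predict_proba(X)[:, 1]))

In this framing, comparing the attack's success with and without the explanation features (and with attribution-based versus concept-based explanations) corresponds to the kind of systematic comparison described in the abstract.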