Wang Shuo, Chen Shangyu, Chen Tianle, Nepal Surya, Rudolph Carsten, Grobler Marthie
IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17070-17084. doi: 10.1109/TNNLS.2023.3299408. Epub 2024 Dec 2.
The susceptibility of deep neural networks (DNNs) to adversarial intrusions, exemplified by adversarial examples, is well documented. Conventional attacks apply unstructured, pixel-wise perturbations to mislead classifiers; these perturbations often depart noticeably from natural samples and lack human-perceptible interpretability. In this work, we present an adversarial attack strategy that applies fine-grained, semantically oriented structural perturbations. Our method manipulates the semantic attributes of images through disentangled latent codes, engineering adversarial perturbations by altering either a single latent code or a combination of codes. To this end, we propose two unsupervised semantic manipulation strategies, one based on vector-disentangled representations and the other on feature-map-disentangled representations, chosen in view of the complexity of the latent codes and the smoothness of the reconstructed images. Extensive empirical evaluations on real-world image data demonstrate the potency of our attacks, particularly against black-box classifiers. Furthermore, we establish the existence of universal semantic adversarial examples that are agnostic to specific images.
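The core idea of the abstract — perturbing a disentangled latent code and decoding, instead of perturbing pixels directly — can be sketched with toy stand-ins. The generator, black-box classifier, and latent space below are assumptions for illustration only, not the paper's actual models or attack algorithm:

```python
# Minimal sketch of a semantic attack via latent-code manipulation.
# All components here are toy stand-ins (assumptions), not the paper's models.
import numpy as np

rng = np.random.default_rng(0)

# Toy "generator": maps a 4-D disentangled latent code to an 8-D "image".
W_gen = rng.normal(size=(8, 4))
def generate(z):
    return W_gen @ z

# Toy black-box "classifier": queried only through its predicted label.
w_clf = rng.normal(size=8)
def classify(x):
    return int(w_clf @ x > 0.0)

def semantic_attack(z, target_dim, step=0.1, max_steps=2000):
    """Greedily shift a single latent code (dimension `target_dim`) until
    the decoded image flips the classifier's label; query-only access."""
    orig_label = classify(generate(z))
    for sign in (+1.0, -1.0):          # try both directions along the code
        z_adv = z.copy()
        for _ in range(max_steps):
            z_adv[target_dim] += sign * step
            if classify(generate(z_adv)) != orig_label:
                return z_adv
    return None                        # attack failed within the budget

z0 = rng.normal(size=4)
z_adv = semantic_attack(z0, target_dim=0)
```

Because only one latent dimension changes, the resulting perturbation is structural in the decoded space rather than unstructured and pixel-wise, which is the contrast the abstract draws with conventional attacks.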