Hard label adversarial attack with high query efficiency against NLP models.

Authors

Qiu Shilin, Liu Qihe, Zhou Shijie, Gou Min, Zeng Yi, Zhang Zhun, Wu Zhewei

Affiliation

School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China.

Publication information

Sci Rep. 2025 Mar 18;15(1):9378. doi: 10.1038/s41598-025-93566-5.

DOI:10.1038/s41598-025-93566-5

PMID:40102502

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11920284/

Abstract

Current black-box adversarial attacks have demonstrated significant efficacy in creating adversarial texts against natural language processing models, exposing potential robustness vulnerabilities of these models. However, present attack techniques exhibit inefficiency due to their failure to account for the query counts needed in the adversarial text generation process, causing a disparity between the existing methodology and the practical adversarial attack scenario. To this end, this work proposes a query-efficient hard-label attack method called QEAttack, which leverages the genetic algorithm to produce persuasive and semantically equivalent adversarial texts relying solely on observing the final predicted label output by the victim model. To reduce query counts, a dual-gradient fusion strategy and a locality sensitive hashing based sentence-level semantic clustering strategy are proposed and applied to the crossover and mutation steps, respectively. Extensive experiments and ablation studies are conducted on three victim models with varying architectures across five benchmark datasets. The results demonstrate that QEAttack consistently achieves high attack success rates with significantly reduced query counts, while maintaining or even enhancing the imperceptibility and quality of generated adversarial texts.
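To make the query-saving idea concrete, here is a minimal sketch of how locality sensitive hashing can group similar candidate sentences so that only one representative per group is sent to a hard-label victim model. It uses generic random-hyperplane LSH, which is an assumption and not necessarily the scheme used in QEAttack. The names `make_hyperplanes`, `lsh_signature`, `cluster_candidates`, `HardLabelOracle`, and `find_adversarial`, as well as the toy embeddings and classifier, are hypothetical and exist only for illustration. The dual-gradient fusion step is not sketched because the abstract gives no detail on it.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_planes, seed=0):
    """Draw random hyperplane normals (illustrative random-hyperplane LSH)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]


def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p_i * v_i for p_i, v_i in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def cluster_candidates(embeddings, planes):
    """Group candidate indices whose sentence embeddings share a signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(embeddings):
        buckets[lsh_signature(vec, planes)].append(idx)
    return list(buckets.values())


class HardLabelOracle:
    """Exposes only the predicted label and counts every query made."""

    def __init__(self, predict_fn):
        self._predict = predict_fn
        self.queries = 0

    def label(self, text):
        self.queries += 1
        return self._predict(text)


def find_adversarial(candidates, embeddings, oracle, original_label, planes):
    """Query one representative per bucket; return the first label flip found."""
    for bucket in cluster_candidates(embeddings, planes):
        representative = candidates[bucket[0]]
        if oracle.label(representative) != original_label:
            return representative
    return None


if __name__ == "__main__":
    # Toy data: the embeddings and the keyword classifier are made up.
    texts = ["a good film", "a fine film", "a bad film"]
    vectors = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
    victim = HardLabelOracle(lambda t: 1 if ("good" in t or "fine" in t) else 0)
    planes = make_hyperplanes(dim=3, n_planes=4, seed=1)
    print(find_adversarial(texts, vectors, victim, 1, planes), victim.queries)
```

The trade-off in this sketch is that a bucket's representative stands in for all of its members, so a successful candidate hidden behind an unsuccessful representative would be missed; more hyperplanes give finer buckets at the cost of more queries.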


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b91/11920284/56c94a434365/41598_2025_93566_Fig1_HTML.jpg

Similar articles

Hard label adversarial attack with high query efficiency against NLP models.

Sci Rep. 2025 Mar 18;15(1):9378. doi: 10.1038/s41598-025-93566-5.

HyGloadAttack: Hard-label black-box textual adversarial attacks via hybrid optimization.

Neural Netw. 2024 Oct;178:106461. doi: 10.1016/j.neunet.2024.106461. Epub 2024 Jun 12.

Query-Efficient Black-Box Adversarial Attack With Customized Iteration and Sampling.

IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):2226-2245. doi: 10.1109/TPAMI.2022.3169802. Epub 2023 Jan 6.

A Word-Level Adversarial Attack Method Based on Sememes and an Improved Quantum-Behaved Particle Swarm Optimization.

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15210-15221. doi: 10.1109/TNNLS.2023.3283308. Epub 2024 Oct 29.

An Optimized Black-Box Adversarial Simulator Attack Based on Meta-Learning.

Entropy (Basel). 2022 Sep 27;24(10):1377. doi: 10.3390/e24101377.

Strongly concealed adversarial attack against text classification models with limited queries.

Neural Netw. 2025 Mar;183:106971. doi: 10.1016/j.neunet.2024.106971. Epub 2024 Nov 30.

Improving the robustness and accuracy of biomedical language models through adversarial training.

J Biomed Inform. 2022 Aug;132:104114. doi: 10.1016/j.jbi.2022.104114. Epub 2022 Jun 15.

Optimizing Latent Variables in Integrating Transfer and Query Based Attack Framework.

IEEE Trans Pattern Anal Mach Intell. 2025 Jan;47(1):161-171. doi: 10.1109/TPAMI.2024.3461686. Epub 2024 Dec 4.

Enhancing robustness in video recognition models: Sparse adversarial attacks and beyond.

Neural Netw. 2024 Mar;171:127-143. doi: 10.1016/j.neunet.2023.11.056. Epub 2023 Nov 25.

Generalizable Black-Box Adversarial Attack With Meta Learning.

IEEE Trans Pattern Anal Mach Intell. 2024 Mar;46(3):1804-1818. doi: 10.1109/TPAMI.2022.3194988. Epub 2024 Feb 6.

