


Adversarial Attacks on Large Language Models in Medicine.

Authors

Yang Yifan, Jin Qiao, Huang Furong, Lu Zhiyong

Affiliations

National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD 20894, USA.

University of Maryland at College Park, Department of Computer Science, College Park, MD 20742, USA.

Publication

ArXiv. 2024 Dec 16:arXiv:2406.12259v3.

PMID: 39398204
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11468488/
Abstract

The integration of Large Language Models (LLMs) into healthcare applications offers promising advancements in medical diagnostics, treatment recommendations, and patient care. However, the susceptibility of LLMs to adversarial attacks poses a significant threat, potentially leading to harmful outcomes in delicate medical contexts. This study investigates the vulnerability of LLMs to two types of adversarial attacks in three medical tasks. Utilizing real-world patient data, we demonstrate that both open-source and proprietary LLMs are vulnerable to malicious manipulation across multiple tasks. We discover that while integrating poisoned data does not markedly degrade overall model performance on medical benchmarks, it can lead to noticeable shifts in fine-tuned model weights, suggesting a potential pathway for detecting and countering model attacks. This research highlights the urgent need for robust security measures and the development of defensive mechanisms to safeguard LLMs in medical applications, to ensure their safe and effective deployment in healthcare settings.

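The abstract's detection idea — that fine-tuning on poisoned data shifts model weights noticeably even when benchmark scores stay flat — can be illustrated with a simple per-layer comparison against a trusted baseline. The sketch below is not the paper's code; the layer names, noise magnitudes, and outlier threshold are all illustrative assumptions.

```python
# Hypothetical sketch: flag layers whose weights moved unusually far
# from a trusted baseline after fine-tuning. All values are synthetic.
import numpy as np

def layer_shift(base, suspect):
    """Relative L2 shift per layer: ||W_suspect - W_base|| / ||W_base||."""
    return {name: float(np.linalg.norm(suspect[name] - base[name]) /
                        np.linalg.norm(base[name]))
            for name in base}

def flag_outliers(shifts, z=2.5):
    """Flag layers whose shift sits more than z std-devs above the mean."""
    vals = np.array(list(shifts.values()))
    mu, sigma = vals.mean(), vals.std()
    return [name for name, s in shifts.items()
            if sigma > 0 and (s - mu) / sigma > z]

rng = np.random.default_rng(0)
base = {f"layer{i}": rng.normal(size=(8, 8)) for i in range(12)}
# A benign fine-tune perturbs every layer slightly...
poisoned = {n: w + 0.01 * rng.normal(size=w.shape) for n, w in base.items()}
# ...while a poisoned one moves a single layer much further.
poisoned["layer7"] = base["layer7"] + 0.5 * rng.normal(size=(8, 8))

print(flag_outliers(layer_shift(base, poisoned)))  # → ['layer7']
```

In practice any distance over weight tensors could replace the relative L2 norm here; the point is only that a localized, oversized shift is detectable even when aggregate task performance is unchanged.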

Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b49/11665022/01be6c0ed43e/nihpp-2406.12259v3-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b49/11665022/d27449dab70a/nihpp-2406.12259v3-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b49/11665022/c628ceee7da9/nihpp-2406.12259v3-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b49/11665022/db60b2e244b2/nihpp-2406.12259v3-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b49/11665022/0ce432476c32/nihpp-2406.12259v3-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b49/11665022/ded3c80d25cc/nihpp-2406.12259v3-f0006.jpg

Similar Articles

1. Adversarial Attacks on Large Language Models in Medicine. ArXiv. 2024 Dec 16:arXiv:2406.12259v3.
2. Medical large language models are susceptible to targeted misinformation attacks. NPJ Digit Med. 2024 Oct 23;7(1):288. doi: 10.1038/s41746-024-01282-7.
3. Utilizing large language models for gastroenterology research: a conceptual framework. Therap Adv Gastroenterol. 2025 Apr 1;18:17562848251328577. doi: 10.1177/17562848251328577. eCollection 2025.
4. A dataset and benchmark for hospital course summarization with adapted large language models. J Am Med Inform Assoc. 2025 Mar 1;32(3):470-479. doi: 10.1093/jamia/ocae312.
5. Mitigating adversarial manipulation in LLMs: a prompt-based approach to counter Jailbreak attacks (Prompt-G). PeerJ Comput Sci. 2024 Oct 22;10:e2374. doi: 10.7717/peerj-cs.2374. eCollection 2024.
6. Cybersecurity Threats and Mitigation Strategies for Large Language Models in Health Care. Radiol Artif Intell. 2025 Jul;7(4):e240739. doi: 10.1148/ryai.240739.
7. Large Language Models and User Trust: Consequence of Self-Referential Learning Loop and the Deskilling of Health Care Professionals. J Med Internet Res. 2024 Apr 25;26:e56764. doi: 10.2196/56764.
8. DeepSeek in Healthcare: Revealing Opportunities and Steering Challenges of a New Open-Source Artificial Intelligence Frontier. Cureus. 2025 Feb 18;17(2):e79221. doi: 10.7759/cureus.79221. eCollection 2025 Feb.
9. Closing the gap between open-source and commercial large language models for medical evidence summarization. ArXiv. 2024 Jul 25:arXiv:2408.00588v1.
10. Mobile applications for skin cancer detection are vulnerable to physical camera-based adversarial attacks. Sci Rep. 2025 May 24;15(1):18119. doi: 10.1038/s41598-025-03546-y.

References Cited in This Article

1. BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records. AMIA Annu Symp Proc. 2025 May 22;2024:768-777. eCollection 2024.
2. Matching patients to clinical trials with large language models. Nat Commun. 2024 Nov 18;15(1):9074. doi: 10.1038/s41467-024-53081-z.
3. Evaluating the Diagnostic Performance of Large Language Models on Complex Multimodal Medical Cases. J Med Internet Res. 2024 May 13;26:e53724. doi: 10.2196/53724.
4. GeneGPT: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae075.
5. Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad493.
6. A large-scale dataset of patient summaries for retrieval-based clinical decision support systems. Sci Data. 2023 Dec 18;10(1):909. doi: 10.1038/s41597-023-02814-8.
7. Exploring the potential utility of AI large language models for medical ethics: an expert panel evaluation of GPT-4. J Med Ethics. 2024 Jan 23;50(2):90-96. doi: 10.1136/jme-2023-109549.
8. Capabilities of GPT-4 in ophthalmology: an analysis of model entropy and progress towards human-level medical question answering. Br J Ophthalmol. 2024 Sep 20;108(10):1371-1378. doi: 10.1136/bjo-2023-324438.
9. ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations. Front Artif Intell. 2023 May 4;6:1169595. doi: 10.3389/frai.2023.1169595. eCollection 2023.
10. ChatGPT goes to the operating room: evaluating GPT-4 performance and its potential in surgical education and training in the era of large language models. Ann Surg Treat Res. 2023 May;104(5):269-273. doi: 10.4174/astr.2023.104.5.269. Epub 2023 Apr 28.