肿瘤学中针对视觉语言模型的提示注入攻击。

Prompt injection attacks on vision language models in oncology.

作者信息

Clusmann Jan, Ferber Dyke, Wiest Isabella C, Schneider Carolin V, Brinker Titus J, Foersch Sebastian, Truhn Daniel, Kather Jakob Nikolas

机构信息

Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.

Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.

出版信息

Nat Commun. 2025 Feb 1;16(1):1239. doi: 10.1038/s41467-024-55631-x.

DOI:10.1038/s41467-024-55631-x

PMID:39890777

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC11785991/

Abstract

Vision-language artificial intelligence models (VLMs) possess medical knowledge and can be employed in healthcare in numerous ways, including as image interpreters, virtual scribes, and general decision support systems. However, here, we demonstrate that current VLMs applied to medical tasks exhibit a fundamental security flaw: they can be compromised by prompt injection attacks. These can be used to output harmful information just by interacting with the VLM, without any access to its parameters. We perform a quantitative study to evaluate the vulnerabilities to these attacks in four state of the art VLMs: Claude-3 Opus, Claude-3.5 Sonnet, Reka Core, and GPT-4o. Using a set of N = 594 attacks, we show that all of these models are susceptible. Specifically, we show that embedding sub-visual prompts in manifold medical imaging data can cause the model to provide harmful output, and that these prompts are non-obvious to human observers. Thus, our study demonstrates a key vulnerability in medical VLMs which should be mitigated before widespread clinical adoption.

摘要

视觉语言人工智能模型（VLMs）拥有医学知识，可在医疗保健领域以多种方式应用，包括作为图像解释器、虚拟抄写员和通用决策支持系统。然而，在此我们证明，应用于医疗任务的当前VLMs存在一个基本的安全漏洞：它们可能会受到提示注入攻击的影响。仅通过与VLM交互，无需访问其参数，这些攻击就可用于输出有害信息。我们进行了一项定量研究，以评估四种先进VLMs（Claude-3 Opus、Claude-3.5 Sonnet、Reka Core和GPT-4o）对这些攻击的脆弱性。使用一组N = 594次攻击，我们表明所有这些模型都易受攻击。具体而言，我们表明在多种医学成像数据中嵌入亚视觉提示会导致模型提供有害输出，并且这些提示对人类观察者来说并不明显。因此，我们的研究证明了医学VLMs中的一个关键漏洞，在广泛临床应用之前应予以缓解。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4052/11785991/f56da87e2c37/41467_2024_55631_Fig1_HTML.jpg

相似文献

Prompt injection attacks on vision language models in oncology.

Nat Commun. 2025 Feb 1;16(1):1239. doi: 10.1038/s41467-024-55631-x.

Diagnostic accuracy of vision-language models on Japanese diagnostic radiology, nuclear medicine, and interventional radiology specialty board examinations.

Jpn J Radiol. 2024 Dec;42(12):1392-1398. doi: 10.1007/s11604-024-01633-0. Epub 2024 Jul 20.

Diagnostic Performance of GPT-4o and Claude 3 Opus in Determining Causes of Death From Medical Histories and Postmortem CT Findings.

Cureus. 2024 Aug 20;16(8):e67306. doi: 10.7759/cureus.67306. eCollection 2024 Aug.

Comparative Performance of Anthropic Claude and OpenAI GPT Models in Basic Radiological Imaging Tasks.

J Med Imaging Radiat Oncol. 2025 Jun;69(4):431-439. doi: 10.1111/1754-9485.13858. Epub 2025 Apr 8.

Diagnostic performance of multimodal large language models in radiological quiz cases: the effects of prompt engineering and input conditions.

Ultrasonography. 2025 May;44(3):220-231. doi: 10.14366/usg.25012. Epub 2025 Mar 11.

Evaluating text and visual diagnostic capabilities of large language models on questions related to the Breast Imaging Reporting and Data System Atlas 5 edition.

Diagn Interv Radiol. 2025 Mar 3;31(2):111-129. doi: 10.4274/dir.2024.242876. Epub 2024 Sep 9.

Accuracy of Large Language Models for Infective Endocarditis Prophylaxis in Dental Procedures.

Int Dent J. 2025 Feb;75(1):206-212. doi: 10.1016/j.identj.2024.09.033. Epub 2024 Oct 12.

Large language models for data extraction from unstructured and semi-structured electronic health records: a multiple model performance evaluation.

BMJ Health Care Inform. 2025 Jan 19;32(1):e101139. doi: 10.1136/bmjhci-2024-101139.

Diagnostic performances of Claude 3 Opus and Claude 3.5 Sonnet from patient history and key images in Radiology's "Diagnosis Please" cases.

Jpn J Radiol. 2024 Dec;42(12):1399-1402. doi: 10.1007/s11604-024-01634-z. Epub 2024 Aug 3.

Evaluation of Advanced Artificial Intelligence Algorithms' Diagnostic Efficacy in Acute Ischemic Stroke: A Comparative Analysis of ChatGPT-4o and Claude 3.5 Sonnet Models.

J Clin Med. 2025 Jan 17;14(2):571. doi: 10.3390/jcm14020571.

引用本文的文献

Large language models for clinical decision support in gastroenterology and hepatology.

Nat Rev Gastroenterol Hepatol. 2025 Aug 22. doi: 10.1038/s41575-025-01108-1.

Prompt injection attacks on vision-language models for surgical decision support.

medRxiv. 2025 Jul 23:2025.07.16.25331645. doi: 10.1101/2025.07.16.25331645.

Computer-aided tumor cell fraction (TCF) estimation by medical students, residents, and pathologists improves inter-observer agreement while highlighting the risk of automation bias.

Virchows Arch. 2025 Jul 4. doi: 10.1007/s00428-025-04163-w.

[Applications, challenges and a trustworthy use of artificial intelligence in public health].

Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2025 Aug;68(8):880-888. doi: 10.1007/s00103-025-04098-2. Epub 2025 Jul 2.

Hallmarks of artificial intelligence contributions to precision oncology.

Nat Cancer. 2025 Mar;6(3):417-431. doi: 10.1038/s43018-025-00917-2. Epub 2025 Mar 7.

本文引用的文献

Evaluating multimodal AI in medical diagnostics.

NPJ Digit Med. 2024 Aug 7;7(1):205. doi: 10.1038/s41746-024-01208-3.

A multimodal generative AI copilot for human pathology.

Nature. 2024 Oct;634(8033):466-473. doi: 10.1038/s41586-024-07618-3. Epub 2024 Jun 12.

Vision-language foundation model for echocardiogram interpretation.

Nat Med. 2024 May;30(5):1481-1488. doi: 10.1038/s41591-024-02959-y. Epub 2024 Apr 30.

Mapping the global geography of cybercrime with the World Cybercrime Index.

PLoS One. 2024 Apr 10;19(4):e0297312. doi: 10.1371/journal.pone.0297312. eCollection 2024.

Comparative Analysis of Multimodal Large Language Model Performance on Clinical Vignette Questions.

JAMA. 2024 Apr 16;331(15):1320-1321. doi: 10.1001/jama.2023.27861.

Adapted large language models can outperform medical experts in clinical text summarization.

Nat Med. 2024 Apr;30(4):1134-1142. doi: 10.1038/s41591-024-02855-5. Epub 2024 Feb 27.

The future landscape of large language models in medicine.

Commun Med (Lond). 2023 Oct 10;3(1):141. doi: 10.1038/s43856-023-00370-1.

Large language models in medicine.

Nat Med. 2023 Aug;29(8):1930-1940. doi: 10.1038/s41591-023-02448-8. Epub 2023 Jul 17.

Large language models encode clinical knowledge.

Nature. 2023 Aug;620(7972):172-180. doi: 10.1038/s41586-023-06291-2. Epub 2023 Jul 12.

The elephant in the room: cybersecurity in healthcare.

J Clin Monit Comput. 2023 Oct;37(5):1123-1132. doi: 10.1007/s10877-023-01013-5. Epub 2023 Apr 24.

文献AI研究员

20分钟写一篇综述，助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型，支持多种主流文档格式。

立即体验

肿瘤学中针对视觉语言模型的提示注入攻击。

Prompt injection attacks on vision language models in oncology.

作者信息

Clusmann Jan, Ferber Dyke, Wiest Isabella C, Schneider Carolin V, Brinker Titus J, Foersch Sebastian, Truhn Daniel, Kather Jakob Nikolas

机构信息

Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.

Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.

出版信息

Nat Commun. 2025 Feb 1;16(1):1239. doi: 10.1038/s41467-024-55631-x.

DOI:10.1038/s41467-024-55631-x

PMID:39890777

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC11785991/

Abstract

摘要

肿瘤学中针对视觉语言模型的提示注入攻击。

Prompt injection attacks on vision language models in oncology.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

本文引用的文献

文献AI研究员

用中文搜PubMed

文档翻译

Suppr 超能文献

肿瘤学中针对视觉语言模型的提示注入攻击。

Prompt injection attacks on vision language models in oncology.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

本文引用的文献