Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support.

Authors

Omar Mahmud, Sorin Vera, Collins Jeremy D, Reich David, Freeman Robert, Gavin Nicholas, Charney Alexander, Stump Lisa, Bragazzi Nicola Luigi, Nadkarni Girish N, Klang Eyal

Affiliations

The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Medical Center, New York, NY, USA.

The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Publication

Commun Med (Lond). 2025 Aug 2;5(1):330. doi: 10.1038/s43856-025-01021-3.

DOI:10.1038/s43856-025-01021-3
PMID:40753316
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12318031/
Abstract

BACKGROUND

Large language models (LLMs) show promise in clinical contexts but can generate false facts (often referred to as "hallucinations"). One subset of these errors arises from adversarial attacks, in which fabricated details embedded in prompts lead the model to produce or elaborate on the false information. We embedded fabricated content in clinical prompts to elicit adversarial hallucination attacks in multiple large language models. We quantified how often they elaborated on false details and tested whether a specialized mitigation prompt or altered temperature settings reduced errors.

METHODS

We created 300 physician-validated simulated vignettes, each containing one fabricated detail (a laboratory test, a physical or radiological sign, or a medical condition). Each vignette was presented in short and long versions, differing only in word count but identical in medical content. We tested six LLMs under three conditions: default (standard settings), mitigating prompt (designed to reduce hallucinations), and temperature 0 (deterministic output with maximum response certainty), generating 5,400 outputs. If a model elaborated on the fabricated detail, the case was classified as a "hallucination".
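The evaluation grid described above (six models × three conditions × 300 vignettes = 5,400 outputs) can be sketched as a simple loop. This is a minimal illustration, not the authors' code: `query_model` and `contains_elaboration` are hypothetical stubs standing in for a real LLM API call and the study's hallucination classification.

```python
from itertools import product

MODELS = [f"llm_{i}" for i in range(6)]  # hypothetical placeholders for the six LLMs tested
CONDITIONS = ["default", "mitigating_prompt", "temperature_0"]

def query_model(model, condition, vignette_text):
    # Hypothetical stub standing in for a real LLM API call.
    return f"[{model}/{condition}] assessment of: {vignette_text}"

def contains_elaboration(output, fabricated_detail):
    # Hypothetical classifier; the study judged whether the model
    # elaborated on the fabricated detail, which substring matching only approximates.
    return fabricated_detail.lower() in output.lower()

def run_experiment(vignettes):
    # One output per (model, condition, vignette) cell of the grid.
    results = []
    for model, condition, v in product(MODELS, CONDITIONS, vignettes):
        out = query_model(model, condition, v["text"])
        results.append({
            "model": model,
            "condition": condition,
            "hallucination": contains_elaboration(out, v["fabricated_detail"]),
        })
    return results
```

With 300 vignettes, the loop yields the paper's 5,400 outputs (6 × 3 × 300).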

RESULTS

Hallucination rates range from 50% to 82% across models and prompting methods. Prompt-based mitigation lowers the overall hallucination rate (mean across all models) from 66% to 44% (p < 0.001). For the best-performing model, GPT-4o, rates decline from 53% to 23% (p < 0.001). Temperature adjustments offer no significant improvement. Short vignettes show slightly higher odds of hallucination.
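The reported drop from 66% to 44% can be sanity-checked with a pooled two-proportion z-test. The sketch below assumes roughly 1,800 outputs per condition (six models × 300 vignettes), which is an inference from the stated totals rather than a figure given in the abstract:

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z statistic, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF via the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 66% hallucination under default vs. 44% under the mitigating prompt.
z, p = two_proportion_z(0.66, 1800, 0.44, 1800)
# Under these assumptions the p-value is far below 0.001,
# consistent with the significance the paper reports.
```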

CONCLUSIONS

LLMs are highly susceptible to adversarial hallucination attacks, frequently generating false clinical details that pose risks when used without safeguards. While prompt engineering reduces errors, it does not eliminate them.


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d72/12318031/d8c615825acc/43856_2025_1021_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d72/12318031/f414cf66b751/43856_2025_1021_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d72/12318031/f8d59facca02/43856_2025_1021_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d72/12318031/9c4f344891a3/43856_2025_1021_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d72/12318031/3487c39e1c50/43856_2025_1021_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d72/12318031/0989a1c83c4a/43856_2025_1021_Fig6_HTML.jpg

