Evaluating chain-of-thought prompting in a GPT chatbot for BCID2 interpretation and stewardship: how does AI compare to human experts?

Authors

Tassone Daniel M, Hitchcock Matthew M, Rossier Connor J, Fletcher Douglas, Ye Julia, Langford Ian, Boatman Julie, Markley J Daniel

Affiliations

Division of Infectious Diseases, Department of Medicine, Central Virginia VA Health Care System, Richmond, VA, USA.

Virginia Commonwealth University, School of Pharmacy, Richmond, VA, USA.

Publication information

Antimicrob Steward Healthc Epidemiol. 2025 Jul 11;5(1):e154. doi: 10.1017/ash.2025.10059. eCollection 2025.

DOI:10.1017/ash.2025.10059
PMID:40657035
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12247004/
Abstract

BACKGROUND

Rapid molecular diagnostics, such as the BIOFIRE® Blood Culture Identification 2 (BCID2) panel, have improved the time to pathogen identification in bloodstream infections. However, accurate interpretation and antimicrobial optimization require Infectious Disease (ID) expertise, which may not always be readily available. GPT-powered chatbots could support antimicrobial stewardship programs (ASPs) by assisting non-specialist providers in BCID2 result interpretation and treatment recommendations. This study evaluates the performance of a GPT-4 chatbot compared to ASP prospective audit and feedback interventions.

METHODS

This prospective observational study assessed 43 consecutive real-world cases of bacteremia at a 399-bed VA Medical Center from January to May 2024. The GPT-chatbot utilized "chain-of-thought" prompting and external knowledge integration to generate recommendations. Two independent ID physicians evaluated chatbot and ASP recommendations across four domains: BCID2 interpretation, source control, antibiotic therapy, and additional diagnostic workup. The primary endpoint was the combined rate of harmful or inadequate recommendations. Secondary endpoints assessed the rate of harmful or inadequate responses for each domain.
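
The abstract does not reproduce the prompt itself, so the following is only a minimal sketch of what chain-of-thought prompting over the four evaluated domains might look like. The `Case` container, the `build_cot_prompt` function, the prompt wording, and the use of a plain text field as a stand-in for "external knowledge integration" are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass

# The four evaluation domains named in the Methods section.
DOMAINS = (
    "BCID2 interpretation",
    "source control",
    "antibiotic therapy",
    "additional diagnostic workup",
)


@dataclass
class Case:
    """Hypothetical container for one bacteremia case (not from the study)."""
    bcid2_result: str       # free-text panel result
    clinical_summary: str   # brief clinical context
    reference_notes: str    # stand-in for external knowledge, e.g. local guidance text


def build_cot_prompt(case: Case) -> str:
    """Assemble a prompt that asks the model to reason step by step, one domain at a time."""
    steps = "\n".join(
        f"Step {i}. Reason explicitly about {domain}, then state a recommendation."
        for i, domain in enumerate(DOMAINS, start=1)
    )
    return (
        "You are assisting an antimicrobial stewardship team.\n"
        f"Reference material:\n{case.reference_notes}\n\n"
        f"BCID2 result: {case.bcid2_result}\n"
        f"Clinical summary: {case.clinical_summary}\n\n"
        "Work through the following steps in order, showing your reasoning:\n"
        f"{steps}\n"
        "Finish with a short summary for the treating provider."
    )
```

Ordering the steps by domain mirrors how the two ID physicians scored the output, which would make a per-domain comparison against ASP recommendations straightforward; whether the study's prompt was structured this way is not stated in the abstract.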

RESULTS

The chatbot had a significantly higher rate of harmful or inadequate recommendations (13%) compared to ASP (4%, P = 0.047). The most significant discrepancy was observed in the domain of antibiotic therapy, where harmful recommendations occurred in up to 10% (P < 0.05) of chatbot evaluations. The chatbot performed well in BCID2 interpretation (100% accuracy) but provided more inadequate responses in source control consideration (10% vs. 2% for ASP, P = 0.022).

CONCLUSIONS

GPT-powered chatbots show potential for supporting antimicrobial stewardship but should only complement, not replace, human expertise in infectious disease management.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c76/12247004/aa159e38d16a/S2732494X25100594_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c76/12247004/f8d739dffebf/S2732494X25100594_fig1.jpg
