Malik Sheza, Kharel Himal, Dahiya Dushyant S, Ali Hassam, Blaney Hanna, Singh Achintya, Dhar Jahnvi, Perisetti Abhilash, Facciorusso Antonio, Chandan Saurabh, Mohan Babu P
Internal Medicine, Rochester General Hospital, NY, USA (Sheza Malik, Himal Kharel).
Gastroenterology, Hepatology, University of Kansas School of Medicine, Kansas, USA (Dushyant S. Dahiya).
Ann Gastroenterol. 2024 Sep-Oct;37(5):514-526. doi: 10.20524/aog.2024.0907. Epub 2024 Aug 19.
In view of the growing complexity of managing anticoagulation for patients undergoing gastrointestinal (GI) procedures, this study evaluated ChatGPT-4's ability to provide accurate medical guidance, comparing it with an earlier artificial intelligence (AI) model (ChatGPT-3.5) and a retrieval-augmented generation (RAG)-supported model (ChatGPT4-RAG).
Thirty-six anticoagulation-related questions, based on professional guidelines, were answered by ChatGPT-4. Nine gastroenterologists assessed these responses for accuracy and relevance. ChatGPT-4's performance was also compared to that of ChatGPT-3.5 and ChatGPT4-RAG. Additionally, a survey was conducted to understand gastroenterologists' perceptions of ChatGPT-4.
ChatGPT-4's responses were significantly more accurate and coherent than those of ChatGPT-3.5, with 30.5% of responses fully accurate and 47.2% generally accurate. ChatGPT4-RAG demonstrated a greater ability to integrate current information, achieving 75% full accuracy. Notably, 51.8% of responses were fully accurate for diagnostic and therapeutic esophagogastroduodenoscopy, 42.8% for endoscopic retrograde cholangiopancreatography with and without stent placement, and 50% for diagnostic and therapeutic colonoscopy.
ChatGPT4-RAG significantly advances anticoagulation management in endoscopic procedures, offering reliable and precise medical guidance. However, medicolegal considerations mean that a 75% full accuracy rate remains inadequate for independent clinical decision-making. AI may be more appropriately utilized to support and confirm clinicians' decisions, rather than replace them. Further evaluation is essential to maintain patient confidentiality and the integrity of the physician-patient relationship.
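The abstract names a RAG-supported model (ChatGPT4-RAG) but does not describe its implementation. The following is a minimal, illustrative sketch of how guideline text could be paired with a ChatGPT-4 prompt via retrieval-augmented generation, assuming the OpenAI Python SDK; the guideline excerpts, embedding model, prompt wording, and parameters are hypothetical and should not be read as the authors' pipeline.

```python
# Illustrative RAG sketch only; the study does not disclose its implementation.
# Model names, guideline excerpts, and prompts below are assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts):
    """Embed a list of strings; the embedding model name is an assumption."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Hypothetical guideline excerpts on periprocedural anticoagulation.
guideline_chunks = [
    "For low-bleeding-risk procedures such as diagnostic EGD, anticoagulation "
    "may generally be continued.",
    "For high-bleeding-risk procedures such as polypectomy, warfarin is "
    "typically held several days beforehand, with bridging per thrombotic risk.",
]
chunk_vectors = embed(guideline_chunks)

def answer(question, k=2):
    """Retrieve the k most similar guideline chunks and include them in the prompt."""
    q_vec = embed([question])[0]
    sims = chunk_vectors @ q_vec / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n".join(guideline_chunks[i] for i in np.argsort(sims)[::-1][:k])
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer using only the guideline excerpts provided."},
            {"role": "user",
             "content": f"Guideline excerpts:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(answer("Should apixaban be held before a diagnostic colonoscopy?"))
```

In a setup of this kind, the retrieval step grounds the model's answer in current guideline text rather than in its training data alone, which is the mechanism the abstract credits for ChatGPT4-RAG's higher full-accuracy rate.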