Clinical artificial intelligence: teaching a large language model to generate recommendations that align with guidelines for the surgical management of GERD.

Affiliations

Division of General Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada.

Ross University School of Medicine, Miramar, FL, USA.

Publication information

Surg Endosc. 2024 Oct;38(10):5668-5677. doi: 10.1007/s00464-024-11155-5. Epub 2024 Aug 12.

DOI:10.1007/s00464-024-11155-5
PMID:39134725
Abstract

BACKGROUND

Large Language Models (LLMs) provide clinical guidance with inconsistent accuracy due to limitations with their training dataset. LLMs are "teachable" through customization. We compared the ability of the generic ChatGPT-4 model and a customized version of ChatGPT-4 to provide recommendations for the surgical management of gastroesophageal reflux disease (GERD) to both surgeons and patients.

METHODS

Sixty patient cases were developed using eligibility criteria from the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) & United European Gastroenterology (UEG)-European Association of Endoscopic Surgery (EAES) guidelines for the surgical management of GERD. Standardized prompts were engineered for physicians as the end-user, with separate layperson prompts for patients. A customized GPT, called the GERD Tool for Surgery (GTS), was developed to generate recommendations based on the guidelines. Both the GTS and generic ChatGPT-4 were queried on July 21, 2024. Model performance was evaluated by comparing responses to SAGES & UEG-EAES guideline recommendations. Outcome data were presented using descriptive statistics, including counts and percentages.

RESULTS

The GTS provided accurate recommendations for the surgical management of GERD for 60/60 (100.0%) surgeon inquiries and 40/40 (100.0%) patient inquiries based on guideline recommendations. The generic ChatGPT-4 model generated accurate guidance for 40/60 (66.7%) surgeon inquiries and 19/40 (47.5%) patient inquiries. The GTS produced recommendations based on the 2021 SAGES & UEG-EAES guidelines on the surgical management of GERD, while the generic ChatGPT-4 model generated guidance without citing evidence to support its recommendations.
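The counts and percentages reported above can be re-derived with a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the dictionary `reported` and the helper `accuracy_percent` are hypothetical names introduced here, and the counts are copied from the RESULTS paragraph.

```python
# Counts copied from the RESULTS paragraph: (accurate responses, total inquiries).
# The structure and names below are assumptions made for illustration only.
reported = {
    ("GTS", "surgeon"): (60, 60),
    ("GTS", "patient"): (40, 40),
    ("generic ChatGPT-4", "surgeon"): (40, 60),
    ("generic ChatGPT-4", "patient"): (19, 40),
}


def accuracy_percent(accurate: int, total: int) -> float:
    """Return accuracy as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * accurate / total, 1)


# Percentage accuracy for each (model, audience) pair.
summary = {key: accuracy_percent(*counts) for key, counts in reported.items()}

for (model, audience), pct in summary.items():
    accurate, total = reported[(model, audience)]
    print(f"{model:>18} | {audience:<7} | {accurate}/{total} ({pct:.1f}%)")
```

The computed percentages agree with the figures stated in the abstract (100.0%, 100.0%, 66.7%, and 47.5%).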

CONCLUSION

ChatGPT-4 can be customized to overcome limitations with its training dataset to provide recommendations for the surgical management of GERD with reliable accuracy and consistency. Training LLMs in this way can help integrate this efficient technology into the creation of robust and accurate information for both surgeons and patients. Prospective data are needed to assess its effectiveness in a pragmatic clinical environment.
