Bots in white coats: are large language models the future of patient education? A multicenter cross-sectional analysis.

Author information

Aghamaliyev Ughur, Karimbayli Javad, Zamparas Athanasios, Bösch Florian, Thomas Michael, Schmidt Thomas, Krautz Christian, Kahlert Christoph, Schölch Sebastian, Angele Martin K, Niess Hanno, Guba Markus O, Werner Jens, Ilmer Matthias, Renz Bernhard W

Affiliations

Department of General, Visceral, and Transplant Surgery, Ludwig-Maximilians-University Munich, Munich, Germany.

Division of Molecular Oncology, Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, National Cancer Institute, Aviano, Italy.

Publication information

Int J Surg. 2025 Mar 1;111(3):2376-2384. doi: 10.1097/JS9.0000000000002250.

Abstract

OBJECTIVES

Every year, around 300 million surgeries are conducted worldwide, with an estimated 4.2 million deaths occurring within 30 days after surgery. Adequate patient education is crucial, but often falls short due to the stress patients experience before surgery. Large language models (LLMs) can significantly enhance this process by delivering thorough information and addressing patient concerns that might otherwise go unnoticed.

MATERIAL AND METHODS

This cross-sectional study evaluated Chat Generative Pretrained Transformer-4o's audio-based responses to frequently asked questions (FAQs) regarding six general surgical procedures. Three experienced surgeons and two senior residents formulated seven general and three procedure-specific FAQs for both preoperative and postoperative situations, covering six surgical scenarios (major: pancreatic head resection, rectal resection, total gastrectomy; minor: cholecystectomy, Lichtenstein procedure, hemithyroidectomy). In total, 120 audio responses were generated, transcribed, and assessed by 11 surgeons from 6 different German university hospitals.
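The total of 120 responses follows from the design described above: 7 general plus 3 procedure-specific questions, asked for both phases, across six procedures. The sketch below only reproduces that arithmetic. The tuple layout and the names `build_question_grid`, `GENERAL_PER_PHASE`, and `SPECIFIC_PER_PHASE` are illustrative assumptions, not part of the study, and it assumes each general question is counted once per procedure and phase, which is consistent with the reported total.

```python
from itertools import product

# Procedures as grouped in the abstract (major vs. minor).
PROCEDURES = {
    "major": ["pancreatic head resection", "rectal resection", "total gastrectomy"],
    "minor": ["cholecystectomy", "Lichtenstein procedure", "hemithyroidectomy"],
}
PHASES = ["preoperative", "postoperative"]

# Question counts per procedure and phase, as stated in the abstract.
GENERAL_PER_PHASE = 7
SPECIFIC_PER_PHASE = 3


def build_question_grid():
    """Enumerate one slot per (category, procedure, phase, question type, index)."""
    grid = []
    for category, names in PROCEDURES.items():
        for name, phase in product(names, PHASES):
            for i in range(1, GENERAL_PER_PHASE + 1):
                grid.append((category, name, phase, "general", i))
            for i in range(1, SPECIFIC_PER_PHASE + 1):
                grid.append((category, name, phase, "procedure-specific", i))
    return grid


grid = build_question_grid()
print(len(grid))  # (7 + 3) questions x 2 phases x 6 procedures = 120
```

Under this layout, each procedure contributes 20 question slots and each phase contributes 60, so the major and minor groups are balanced at 60 slots each.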

RESULTS

ChatGPT-4o demonstrated strong performance, achieving an average score of 4.12/5 for accuracy, 4.46/5 for relevance, and 0.22/5 for potential harm across 120 questions. Postoperative responses surpassed preoperative ones in both accuracy and relevance, while also exhibiting lower potential for harm. Additionally, responses related to minor surgeries were slightly, but significantly, more accurate than those for major surgeries.

CONCLUSIONS

This study underscores GPT-4o's potential to enhance patient education both before and after surgery by delivering accurate and relevant responses to FAQs about various surgical procedures. Responses regarding the postoperative course proved to be more accurate and less harmful than those addressing the preoperative course. Although a few responses carried moderate risks, the overall performance was robust, indicating GPT-4o's value in patient education. The study suggests developing hospital-specific applications or integrating GPT-4o into interactive robotic systems to provide patients with reliable, immediate answers, thereby improving patient satisfaction and informed decision-making.
