
Evaluating the Accuracy and Readability of ChatGPT in Addressing Patient Queries on Adult Spinal Deformity Surgery.

Author Information

Hernandez Fergui, Guizar Rafael, Avetisian Henry, Abdou Marc A, Karakash William J, Ton Andy, Gallo Matthew C, Ball Jacob R, Wang Jeffrey C, Alluri Ram K, Hah Raymond J, Safaee Michael

Affiliations

Department of Orthopaedic Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.

Department of Orthopaedic Surgery, University of California, Irvine, CA, USA.

Publication Information

Global Spine J. 2025 Jul 11:21925682251360655. doi: 10.1177/21925682251360655.

Abstract

Study Design

Cross-sectional.

Objectives

Adult spinal deformity (ASD) affects 68% of the elderly, and surgical intervention carries complication rates of up to 50%. Effective patient education is essential for managing expectations, yet high patient volumes can limit preoperative counseling. Large language models (LLMs) such as ChatGPT may supplement patient education. This study evaluates the accuracy and readability of ChatGPT-3.5's answers to common patient questions regarding ASD surgery.

Methods

Structured interviews with ASD surgery patients identified 40 common preoperative questions, of which 19 were selected. Each question was posed to ChatGPT-3.5 in a separate chat session to ensure independent responses. Three spine surgeons assessed response accuracy using a validated 4-point scale (1 = excellent, 4 = unsatisfactory). Readability was analyzed using the Flesch-Kincaid Grade Level formula. (Illustrative sketches of the querying and readability steps follow the abstract.)

Results

Patient inquiries fell into four themes: (1) preoperative preparation, (2) recovery (pain expectations, physical therapy), (3) lifestyle modifications, and (4) postoperative course. Accuracy scores varied: preoperative responses averaged 1.67, recovery and lifestyle responses 1.33, and postoperative responses 2.0. Overall, 59.7% of responses were excellent (no clarification needed), 26.3% were satisfactory (minimal clarification needed), 12.3% required moderate clarification, and 1.8% were unsatisfactory; one response ("Will my pain return or worsen?") was rated inaccurate by all reviewers. Readability analysis showed that all 19 responses exceeded the eighth-grade reading level, by an average of 5.91 grade levels.

Conclusion

ChatGPT-3.5 shows potential as a supplemental patient education tool, but its accuracy varies and its responses are written at a complex reading level. While it may support patient understanding, that complexity may limit its usefulness for individuals with lower health literacy.
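As a rough illustration of the querying protocol in the Methods, the sketch below poses each question to ChatGPT-3.5 in an independent API call with no shared conversation history, which approximates separate chat sessions. It assumes the OpenAI Python client and the gpt-3.5-turbo model; the study's exact interface (ChatGPT web app vs. API) and prompts are not specified, and the first question below is a hypothetical paraphrase.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Two example patient questions; the first is hypothetical,
# the second is quoted verbatim in the Results.
questions = [
    "How should I prepare for adult spinal deformity surgery?",
    "Will my pain return or worsen?",
]

responses = {}
for question in questions:
    # A fresh messages list per request means no shared conversation
    # history, approximating the study's separate chat sessions.
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
    )
    responses[question] = completion.choices[0].message.content

for q, r in responses.items():
    print(f"Q: {q}\nA: {r[:200]}...\n")
```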

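The readability analysis relies on the Flesch-Kincaid Grade Level formula, FKGL = 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. Below is a minimal Python sketch; the syllable counter is a crude vowel-group heuristic (published tools use dictionary-based counts), so scores are approximate.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count vowel groups, dropping a common silent trailing 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

# Hypothetical response text; the eighth-grade target corresponds to FKGL <= 8.
sample = ("Recovery after spinal fusion is gradual. Most patients walk the day "
          "after surgery, and physical therapy usually begins within a few weeks.")
print(f"FKGL: {flesch_kincaid_grade(sample):.1f}")
```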

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77f3/12254131/1de4f912052d/10.1177_21925682251360655-fig1.jpg
