Evaluating if ChatGPT Can Answer Common Patient Questions Compared With OrthoInfo Regarding Rotator Cuff Tears.

Author Information

Jurayj Alexander, Nerys-Figueroa Julio, Espinal Emil, Gaudiani Michael A, Baes Travis, Mahylis Jared, Muh Stephanie

Affiliations

From the Department of Orthopaedic Surgery, Henry Ford Hospital, Detroit, MI.

Publication Information

J Am Acad Orthop Surg Glob Res Rev. 2025 Mar 11;9(3). doi: 10.5435/JAAOSGlobal-D-24-00289. eCollection 2025 Mar 1.

Abstract

PURPOSE

To evaluate ChatGPT's (OpenAI) ability to provide accurate, appropriate, and readable responses to common patient questions about rotator cuff tears.

METHODS

Eight questions from the OrthoInfo rotator cuff tear web page were submitted to ChatGPT in two forms: as standard prompts and as prompts requesting a sixth-grade reading level. Five orthopaedic surgeons assessed the accuracy and appropriateness of responses using a Likert scale, and the Flesch-Kincaid Grade Level measured readability. Results were analyzed with a paired Student t-test.
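The Flesch-Kincaid Grade Level used for the readability comparison follows a published formula, so it can be reproduced with a short script. The sketch below is illustrative only: it uses the standard FK formula with a naive vowel-group syllable counter, whereas production readability tools apply more careful syllable rules and dictionaries, so scores may differ slightly from those reported in the study.

```python
import re

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Split sentences on terminal punctuation; keep only non-empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def count_syllables(word: str) -> int:
        # Naive heuristic: count vowel groups, drop a trailing silent 'e',
        # and never return fewer than one syllable.
        groups = re.findall(r"[aeiouy]+", word.lower())
        count = len(groups)
        if word.lower().endswith("e") and count > 1:
            count -= 1
        return max(count, 1)

    total_syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (total_syllables / len(words))
            - 15.59)
```

Short, monosyllabic sentences score low (even below grade 0), while longer sentences with polysyllabic words score higher, which is the behavior the study relies on when comparing standard and sixth-grade-level responses.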

RESULTS

Standard ChatGPT responses scored higher in accuracy (4.7 ± 0.47 vs. 3.6 ± 0.76; P < 0.001) and appropriateness (4.5 ± 0.57 vs. 3.7 ± 0.98; P < 0.001) compared with sixth-grade responses. However, standard ChatGPT responses were less accurate (4.7 ± 0.47 vs. 5.0 ± 0.0; P = 0.004) and appropriate (4.5 ± 0.57 vs. 5.0 ± 0.0; P = 0.016) when compared with OrthoInfo responses. OrthoInfo responses were also notably better than sixth-grade responses in both accuracy and appropriateness (P < 0.001). Standard responses had a higher Flesch-Kincaid grade level compared with both OrthoInfo and sixth-grade responses (P < 0.001).

CONCLUSION

Standard ChatGPT responses were less accurate and appropriate, with worse readability compared with OrthoInfo responses. Despite being easier to read, sixth-grade level ChatGPT responses compromised on accuracy and appropriateness. At this time, ChatGPT is not recommended as a standalone source for patient information on rotator cuff tears but may supplement information provided by orthopaedic surgeons.
