Alexander Jurayj, Julio Nerys-Figueroa, Emil Espinal, Michael A. Gaudiani, Travis Baes, Jared Mahylis, Stephanie Muh
From the Department of Orthopaedic Surgery, Henry Ford Hospital, Detroit, MI.
J Am Acad Orthop Surg Glob Res Rev. 2025 Mar 11;9(3). doi: 10.5435/JAAOSGlobal-D-24-00289. eCollection 2025 Mar 1.
To evaluate ChatGPT's (OpenAI) ability to provide accurate, appropriate, and readable responses to common patient questions about rotator cuff tears.
Eight questions from the OrthoInfo rotator cuff tear web page were input into ChatGPT in two forms: a standard prompt and a prompt requesting a response at a sixth-grade reading level. Five orthopaedic surgeons rated the accuracy and appropriateness of each response on a Likert scale, and readability was measured with the Flesch-Kincaid Grade Level. Results were analyzed with a paired Student t-test; an illustrative sketch of this type of analysis follows.
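The sketch below is not the authors' code; it is a minimal illustration, assuming Python with the scipy and textstat packages and made-up rating data, of how paired Likert ratings and a Flesch-Kincaid Grade Level score could be computed as described in the methods.

```python
# Illustrative only; hypothetical data, not the study's dataset.
from scipy import stats
import textstat

# Hypothetical per-question accuracy ratings (1-5 Likert) for the same eight
# questions: standard ChatGPT responses vs. sixth-grade-level responses.
standard_scores = [5, 5, 4, 5, 5, 4, 5, 5]
sixth_grade_scores = [4, 3, 4, 3, 4, 3, 4, 4]

# Paired Student t-test, the comparison reported in the study.
t_stat, p_value = stats.ttest_rel(standard_scores, sixth_grade_scores)
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")

# Flesch-Kincaid Grade Level of a sample response text.
sample_response = "A rotator cuff tear is an injury to the tendons of the shoulder."
print("Flesch-Kincaid grade level:", textstat.flesch_kincaid_grade(sample_response))
```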
Standard ChatGPT responses scored higher in accuracy (4.7 ± 0.47 vs. 3.6 ± 0.76; P < 0.001) and appropriateness (4.5 ± 0.57 vs. 3.7 ± 0.98; P < 0.001) compared with sixth-grade responses. However, standard ChatGPT responses were less accurate (4.7 ± 0.47 vs. 5.0 ± 0.0; P = 0.004) and appropriate (4.5 ± 0.57 vs. 5.0 ± 0.0; P = 0.016) when compared with OrthoInfo responses. OrthoInfo responses were also notably better than sixth-grade responses in both accuracy and appropriateness (P < 0.001). Standard responses had a higher Flesch-Kincaid grade level compared with both OrthoInfo and sixth-grade responses (P < 0.001).
Standard ChatGPT responses were less accurate and appropriate, with worse readability compared with OrthoInfo responses. Despite being easier to read, sixth-grade level ChatGPT responses compromised on accuracy and appropriateness. At this time, ChatGPT is not recommended as a standalone source for patient information on rotator cuff tears but may supplement information provided by orthopaedic surgeons.