Evaluating if ChatGPT Can Answer Common Patient Questions Compared to OrthoInfo Regarding Lateral Epicondylitis.

Author Information

Espinal Emil, Jurayj Alexander, Nerys-Figueroa Julio, Gaudiani Michael A, Baes Travis, Mahylis Jared, Muh Stephanie

Affiliation

Department of Orthopaedic Surgery, Henry Ford Hospital, Detroit, Michigan, USA.

Publication Information

Iowa Orthop J. 2025;45(1):19-32.

Abstract

BACKGROUND

As online medical resources become more accessible, patients increasingly consult AI platforms like ChatGPT for health-related information. Our study assessed the accuracy and appropriateness of ChatGPT's responses to common questions about lateral epicondylitis, comparing them against OrthoInfo as a gold standard.

METHODS

Eight frequently asked questions about lateral epicondylitis from OrthoInfo were selected and presented to ChatGPT at both a standard and a sixth-grade reading level. Responses were evaluated for accuracy and appropriateness on a five-point Likert scale, with scores of four or above deemed satisfactory. Evaluations were conducted by two fellowship-trained shoulder and elbow surgeons, two hand surgeons, and one orthopaedic sports medicine fellow. Readability was assessed with the Flesch-Kincaid Grade Level test, and responses were statistically compared using paired t-tests.
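
For reference, the Flesch-Kincaid Grade Level used in this study is a standard readability formula: grade = 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. Below is a minimal Python sketch of that computation; the syllable counter is a rough heuristic invented for illustration, not the tool the authors used.

import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels.
    # Real readability tools use dictionaries or better rules.
    groups = re.findall(r"[aeiouy]+", word.lower())
    count = len(groups)
    if word.lower().endswith("e") and count > 1:
        count -= 1  # treat a typical final 'e' as silent
    return max(count, 1)

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

print(round(flesch_kincaid_grade("The doctor examined the elbow. Rest and ice often help."), 1))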

RESULTS

ChatGPT's responses at the sixth-grade level scored lower in accuracy (mean = 3.9 ± 0.87, p = 0.046) and appropriateness (mean = 3.7 ± 0.92, p = 0.045) than its standard-level responses (accuracy = 4.7 ± 0.43, appropriateness = 4.7 ± 0.45). Compared with OrthoInfo, ChatGPT's standard responses showed significantly lower accuracy (mean difference = -0.275, p = 0.004) and appropriateness (mean difference = -0.475, p = 0.016). The Flesch-Kincaid grade level of the standard responses (mean = 14.06) was significantly higher (p < 0.001) than that of both OrthoInfo (mean = 8.98) and the sixth-grade responses (mean = 8.48). No significant difference was found between the Flesch-Kincaid grade levels of OrthoInfo and the sixth-grade responses.
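
To illustrate the paired t-test comparison reported above, the sketch below applies scipy.stats.ttest_rel to hypothetical per-question accuracy scores; the values are invented for the example and are not the study's data.

import numpy as np
from scipy.stats import ttest_rel

# Hypothetical mean accuracy scores per question (eight questions),
# invented for illustration -- not the study's data.
chatgpt_standard = np.array([4.8, 4.2, 4.6, 5.0, 4.4, 4.8, 4.6, 4.8])
orthoinfo = np.array([5.0, 4.8, 4.8, 5.0, 4.8, 5.0, 5.0, 5.0])

mean_difference = float(np.mean(chatgpt_standard - orthoinfo))  # negative => ChatGPT lower
t_stat, p_value = ttest_rel(chatgpt_standard, orthoinfo)
print(f"mean difference = {mean_difference:.3f}, t = {t_stat:.2f}, p = {p_value:.4f}")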

CONCLUSION

At a sixth-grade reading level, ChatGPT provides oversimplified and less accurate information regarding lateral epicondylitis. Although standard-level responses are more accurate, they still fall short of the reliability of OrthoInfo and exceed the recommended reading level for patient education materials. While ChatGPT cannot be recommended as a sole source of information, it may serve as a supplementary resource alongside professional medical consultation.


立即体验