Nieves-Lopez Benjamin, Wing Clayton, Springer Bryan D, Aziz Keith T
University of Puerto Rico, Medical Sciences Campus, San Juan, Puerto Rico.
Department of Orthopedic Surgery, Mayo Clinic Florida, Jacksonville, FL.
Arthroplast Today. 2025 Jul 14;34:101772. doi: 10.1016/j.artd.2025.101772. eCollection 2025 Aug.
Chat Generative Pre-trained Transformer (ChatGPT) is a language model designed to conduct conversations using extensive data from the internet. Despite its potential, the utility of ChatGPT in orthopaedic surgery, particularly in arthroplasty, is still being investigated. This study assesses ChatGPT's performance on arthroplasty-related questions in comparison to an Adult Reconstruction Fellow and a senior-level attending arthroplasty surgeon.
A total of 299 questions from the Adult Reconstruction self-assessment on OrthoBullets were evaluated using ChatGPT 4. Performance was analyzed across different question categories and compared with the performance of an Adult Reconstruction Fellow and a senior-level attending arthroplasty surgeon using a chi-square test. Further comparisons were performed to assess ChatGPT's accuracy rate on image-based questions. Statistical significance was set at a P value ≤ .05.
ChatGPT achieved a 66.9% accuracy rate, compared to 84.3% and 85.3% for the Fellow and Attending, respectively. No significant differences in ChatGPT's performance were observed across question categories. ChatGPT demonstrated better results on text-only than on image-based questions. Although not statistically significant, ChatGPT showed its highest accuracy rate on questions that included both an X-ray and a clinical picture.
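To make the comparison concrete, below is a minimal sketch of a chi-square test of the kind described in the methods, with correct/incorrect counts reconstructed from the reported accuracy rates (66.9%, 84.3%, and 85.3% of 299 questions). The authors' per-question data, pairwise comparison structure, and statistical software are not given here, so this is an illustrative approximation using scipy, not the study's actual analysis.

```python
# Approximate chi-square comparison of accuracy across the three responders.
# Counts are reconstructed from the reported percentages (assumption), not
# taken from the study's raw data.
from scipy.stats import chi2_contingency

TOTAL = 299
rates = {"ChatGPT": 0.669, "Fellow": 0.843, "Attending": 0.853}

# 2 x 3 contingency table: rows = correct / incorrect, columns = responder
correct = [round(r * TOTAL) for r in rates.values()]
incorrect = [TOTAL - c for c in correct]
table = [correct, incorrect]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
# A p value <= .05 would indicate a significant difference in accuracy
# across the three responders, matching the study's significance threshold.
```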
ChatGPT performed worse than both the Adult Reconstruction Fellow and the Attending, and it provided more accurate answers when prompted with text-only questions. These findings suggest that while ChatGPT can serve as a useful supplementary resource for arthroplasty topics, it cannot substitute for the clinical judgment required in detailed assessments. Further research is necessary to optimize and validate the use of artificial intelligence in medical education and patient care.