Responses From ChatGPT-4 Show Limited Correlation With Expert Consensus Statement on Anterior Shoulder Instability.

Authors

Artamonov Alexander, Bachar-Avnieli Ira, Klang Eyal, Lubovsky Omri, Atoun Ehud, Bermant Alexander, Rosinsky Philip J

Affiliations

Orthopedic Department, Barzilai Medical Center, Ashkelon, Israel.

Ben-Gurion University, Beer-Sheva, Israel.

Publication information

Arthrosc Sports Med Rehabil. 2024 Mar 5;6(3):100923. doi: 10.1016/j.asmr.2024.100923. eCollection 2024 Jun.

DOI: 10.1016/j.asmr.2024.100923

PMID: 39006799

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11240044/

Abstract

PURPOSE

To compare the similarity of answers provided by Generative Pretrained Transformer-4 (GPT-4) with those of a consensus statement on diagnosis, nonoperative management, and Bankart repair in anterior shoulder instability (ASI).

METHODS

An expert consensus statement on ASI published by Hurley et al. in 2022 was reviewed, and the questions posed to the expert panel were extracted. GPT-4, the subscription version of ChatGPT, was queried using the same set of questions. Answers provided by GPT-4 were compared with those of the expert panel and subjectively rated for similarity by 2 experienced shoulder surgeons. GPT-4 was then used to rate the similarity of its own responses to the consensus statement, classifying each as low, medium, or high. The similarity ratings assigned by the shoulder surgeons and by GPT-4 were then compared, and interobserver reliability was calculated using weighted κ scores.
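The weighted κ mentioned above can be illustrated with a minimal sketch. The abstract does not say which weighting scheme was used, so linear weights are an assumption here. The function name `weighted_kappa` and the six example ratings are hypothetical and are not data from the study.

```python
from collections import Counter

# Ordinal similarity scale used in the abstract.
LEVELS = ["low", "medium", "high"]


def weighted_kappa(rater_a, rater_b, levels=LEVELS):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    Linear weighting is an assumption; the abstract does not name the scheme.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    k = len(levels)
    index = {level: i for i, level in enumerate(levels)}
    n = len(rater_a)

    # Observed joint counts and each rater's marginal counts.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    # Disagreement weight: 0 for identical ratings, 1 for low vs. high.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    obs = sum(weight(i, j) * c / n for (i, j), c in observed.items())
    exp = sum(
        weight(i, j) * (marg_a[i] / n) * (marg_b[j] / n)
        for i in range(k)
        for j in range(k)
    )
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs / exp


# Hypothetical ratings for six questions (illustration only, not study data).
surgeons = ["high", "medium", "low", "medium", "low", "high"]
gpt4 = ["high", "high", "medium", "medium", "low", "high"]
print(round(weighted_kappa(surgeons, gpt4), 3))  # prints 0.625
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance. Unlike raw percent agreement, the weighted form penalizes a low-versus-high disagreement more heavily than a disagreement between adjacent categories.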

RESULTS

The degree of similarity between the responses of GPT-4 and the ASI consensus statement, as rated by the shoulder surgeons, was high in 25.8%, medium in 45.2%, and low in 29% of questions. GPT-4 assessed similarity as high in 48.3%, medium in 41.9%, and low in 9.7% of questions. Surgeons and GPT-4 agreed on the classification of 18 questions (58.1%) and disagreed on 13 questions (41.9%).

CONCLUSIONS

The responses generated by artificial intelligence exhibit limited correlation with an expert statement on the diagnosis and treatment of ASI.

CLINICAL RELEVANCE

As the use of artificial intelligence becomes more prevalent, it is important to understand how closely the information it generates resembles content produced by human authors.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c3d6/11240044/56959c40a3bf/gr1.jpg

Similar articles

Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study.

J Med Internet Res. 2024 Apr 17;26:e56655. doi: 10.2196/56655.

Performance of ChatGPT on the Peruvian National Licensing Medical Examination: Cross-Sectional Study.

JMIR Med Educ. 2023 Sep 28;9:e48039. doi: 10.2196/48039.

Anterior Shoulder Instability Part I-Diagnosis, Nonoperative Management, and Bankart Repair-An International Consensus Statement.

Arthroscopy. 2022 Feb;38(2):214-223.e7. doi: 10.1016/j.arthro.2021.07.022. Epub 2021 Jul 29.

A Comparison of ChatGPT and Expert Consensus Statements on Surgical Site Infection Prevention in High-Risk Paediatric Spine Surgery.

J Pediatr Orthop. 2025 Jan 1;45(1):e72-e75. doi: 10.1097/BPO.0000000000002781. Epub 2024 Aug 30.

A Generative Pretrained Transformer (GPT)-Powered Chatbot as a Simulated Patient to Practice History Taking: Prospective, Mixed Methods Study.

JMIR Med Educ. 2024 Jan 16;10:e53961. doi: 10.2196/53961.

neuroGPT-X: toward a clinic-ready large language model.

J Neurosurg. 2023 Oct 6;140(4):1041-1053. doi: 10.3171/2023.7.JNS23573. Print 2024 Apr 1.

Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study.

JMIR Form Res. 2023 Oct 13;7:e48023. doi: 10.2196/48023.

Enhanced Artificial Intelligence Strategies in Renal Oncology: Iterative Optimization and Comparative Analysis of GPT 3.5 Versus 4.0.

Ann Surg Oncol. 2024 Jun;31(6):3887-3893. doi: 10.1245/s10434-024-15107-0. Epub 2024 Mar 12.

Artificial Intelligence-Powered Hand Surgery Consultation: GPT-4 as an Assistant in a Hand Surgery Outpatient Clinic.

J Hand Surg Am. 2024 Nov;49(11):1078-1088. doi: 10.1016/j.jhsa.2024.06.002. Epub 2024 Jul 26.

Cited by

Development of a novel artificial intelligence clinical decision support tool for hand surgery: HandRAG.

J Hand Microsurg. 2025 Jun 11;17(4):100293. doi: 10.1016/j.jham.2025.100293. eCollection 2025 Jul.

A custom ChatGPT can accurately answer questions from an international expert osteotomy consensus statement.

Eur J Orthop Surg Traumatol. 2025 Jun 16;35(1):247. doi: 10.1007/s00590-025-04373-7.

Is it a pediatric orthopaedic urgency or not? Can ChatGPT answer this question?

J Orthop Surg Res. 2025 Jun 4;20(1):567. doi: 10.1186/s13018-025-05981-z.

Can Artificial Intelligence Help Orthopaedic Surgeons in the Conservative Management of Knee Osteoarthritis? A Consensus Analysis.

J Clin Med. 2025 Jan 22;14(3):690. doi: 10.3390/jcm14030690.

References

Exploring the potential of ChatGPT as a supplementary tool for providing orthopaedic information.

Knee Surg Sports Traumatol Arthrosc. 2023 Nov;31(11):5190-5198. doi: 10.1007/s00167-023-07529-2. Epub 2023 Aug 8.

Artificial intelligence and ChatGPT in Orthopaedics and sports medicine.

J Exp Orthop. 2023 Jul 26;10(1):74. doi: 10.1186/s40634-023-00642-8.

Assessing ChatGPT Responses to Common Patient Questions Regarding Total Hip Arthroplasty.

J Bone Joint Surg Am. 2023 Oct 4;105(19):1519-1526. doi: 10.2106/JBJS.23.00209. Epub 2023 Jul 17.

Can Artificial Intelligence Pass the American Board of Orthopaedic Surgery Examination? Orthopaedic Residents Versus ChatGPT.

Clin Orthop Relat Res. 2023 Aug 1;481(8):1623-1630. doi: 10.1097/CORR.0000000000002704. Epub 2023 May 23.

ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns.

Healthcare (Basel). 2023 Mar 19;11(6):887. doi: 10.3390/healthcare11060887.

Comprehensive Review of Shoulder Instability Includes Diagnosis, Nonoperative Management, Bankart, Latarjet, Remplissage, Glenoid Bone-Grafting, Revision Surgery, Rehabilitation and Return to Play, and Clinical Follow-Up.

Arthroscopy. 2022 Feb;38(2):209-210. doi: 10.1016/j.arthro.2021.11.052.

Anterior Shoulder Instability Part I-Diagnosis, Nonoperative Management, and Bankart Repair-An International Consensus Statement.

Arthroscopy. 2022 Feb;38(2):214-223.e7. doi: 10.1016/j.arthro.2021.07.022. Epub 2021 Jul 29.

Diagnosis and Management of Traumatic Anterior Shoulder Instability.

J Am Acad Orthop Surg. 2021 Jan 15;29(2):e51-e61. doi: 10.5435/JAAOS-D-20-00202.

The Bankart repair: past, present, and future.

J Shoulder Elbow Surg. 2020 Dec;29(12):e491-e498. doi: 10.1016/j.jse.2020.06.012. Epub 2020 Jul 2.

Position and duration of immobilization after primary anterior shoulder dislocation: a systematic review and meta-analysis of the literature.

J Bone Joint Surg Am. 2010 Dec 15;92(18):2924-33. doi: 10.2106/JBJS.J.00631.
