ChatGPT-3.5 and -4 provide mostly accurate information when answering patients' questions relating to femoroacetabular impingement syndrome and arthroscopic hip surgery.

Authors

Slawaska-Eng David, Bourgeault-Gagnon Yoan, Cohen Dan, Pauyo Thierry, Belzile Etienne L, Ayeni Olufemi R

Affiliations

Division of Orthopaedic Surgery, Department of Surgery, McMaster University, 1200 Main St West, Hamilton, Ontario, L8N 3Z5, Canada.

Division of Orthopaedic Surgery, McGill University, 845 Rue Sherbrooke O, Montréal, QC H3A 0G4, Canada.

Publication

J ISAKOS. 2025 Feb;10:100376. doi: 10.1016/j.jisako.2024.100376. Epub 2024 Dec 12.

DOI: 10.1016/j.jisako.2024.100376
PMID: 39674512

Abstract

OBJECTIVES

This study aimed to evaluate the accuracy of ChatGPT in answering patient questions about femoroacetabular impingement (FAI) and arthroscopic hip surgery, comparing the performance of versions ChatGPT-3.5 (free) and ChatGPT-4 (paid).

METHODS

Twelve frequently asked questions (FAQs) relating to FAI were selected and posed to ChatGPT-3.5 and ChatGPT-4. The responses were assessed for accuracy by three hip arthroscopy surgeons using a four-tier grading system. Statistical analyses included Wilcoxon signed-rank tests and Gwet's AC2 coefficient for interrater agreement corrected for chance and employing quadratic weights.
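
To make the agreement statistic named above concrete, here is a minimal Python sketch of Gwet's AC2 with quadratic weights for several raters grading items on an ordinal scale. It follows the standard multi-rater definition of the coefficient as commonly published, not the authors' own analysis code. The function names and the example ratings are hypothetical, and confidence intervals are not computed.

```python
def quadratic_weights(q):
    """Quadratic agreement weights for q ordered categories (1 on the diagonal)."""
    return [[1.0 - ((k - l) ** 2) / ((q - 1) ** 2) for l in range(q)]
            for k in range(q)]


def gwet_ac2(ratings, q):
    """Gwet's AC2 with quadratic weights.

    ratings[i] lists the grades (integers 1..q) that each rater gave item i.
    Assumes every item has at least one grade and at least one item has two.
    """
    w = quadratic_weights(q)
    n = len(ratings)
    pi = [0.0] * q          # average share of grades falling in each category
    pa_sum = 0.0            # summed weighted agreement over multiply-rated items
    n_multi = 0             # number of items graded by two or more raters
    for item in ratings:
        r = len(item)
        counts = [0] * q
        for grade in item:
            counts[grade - 1] += 1
        for k in range(q):
            pi[k] += counts[k] / r / n
        if r >= 2:
            n_multi += 1
            for k in range(q):
                # Grades in nearby categories count as partial agreement.
                weighted = sum(w[k][l] * counts[l] for l in range(q))
                pa_sum += counts[k] * (weighted - 1.0) / (r * (r - 1))
    pa = pa_sum / n_multi
    total_weight = sum(sum(row) for row in w)
    # Chance agreement under Gwet's model.
    pe = total_weight / (q * (q - 1)) * sum(p * (1.0 - p) for p in pi)
    return (pa - pe) / (1.0 - pe)


# Hypothetical grades from three raters on a four-tier scale (not study data).
example = [[1, 1, 2], [2, 2, 2], [1, 2, 3], [3, 3, 3]]
print(round(gwet_ac2(example, q=4), 3))
```

With quadratic weights, two raters who differ by one tier are penalized far less than two who differ by three tiers, which suits an ordinal grading scale such as the four-tier system used here.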

RESULTS

The median ratings for responses ranged from "excellent not requiring clarification" to "satisfactory requiring moderate clarification." No responses were rated as "unsatisfactory requiring substantial clarification." The median accuracy scores were 2 (range 1-3) for ChatGPT-3.5 and 1.5 (range 1-3) for ChatGPT-4, with 25% of ChatGPT-3.5's responses and 50% of ChatGPT-4's responses rated as "excellent." There was no statistical difference in performance between the two versions (p = 0.279) although ChatGPT-4 showed a tendency towards higher accuracy in some areas. Interrater agreement was substantial for ChatGPT-3.5 (Gwet's AC2 = 0.79 [95% confidence interval (CI) = 0.6-0.94]) and moderate to substantial for ChatGPT-4 (Gwet's AC2 = 0.65 [95% CI = 0.43-0.87]).

CONCLUSION

Both versions of ChatGPT provided mostly accurate responses to FAQs on FAI and arthroscopic surgery, with no significant difference between the versions. The findings suggest potential utility of ChatGPT in patient education, though cautious implementation and further evaluation are recommended due to variability in response accuracy and low power of the study.

LEVEL OF EVIDENCE

IV.

Similar articles

1
ChatGPT-3.5 and -4 provide mostly accurate information when answering patients' questions relating to femoroacetabular impingement syndrome and arthroscopic hip surgery.
J ISAKOS. 2025 Feb;10:100376. doi: 10.1016/j.jisako.2024.100376. Epub 2024 Dec 12.
2
ChatGPT Can Often Respond Adequately to Common Patient Questions Regarding Femoroacetabular Impingement.
Clin J Sport Med. 2024 Dec 24. doi: 10.1097/JSM.0000000000001327.
3
Using Google web search to analyze and evaluate the application of ChatGPT in femoroacetabular impingement syndrome.
Front Public Health. 2024 May 31;12:1412063. doi: 10.3389/fpubh.2024.1412063. eCollection 2024.
4
ChatGPT Can Offer At Least Satisfactory Responses to Common Patient Questions Regarding Hip Arthroscopy.
Arthroscopy. 2025 Jun;41(6):1806-1827. doi: 10.1016/j.arthro.2024.08.036. Epub 2024 Sep 5.
5
Do large language model chatbots perform better than established patient information resources in answering patient questions? A comparative study on melanoma.
Br J Dermatol. 2025 Jan 24;192(2):306-315. doi: 10.1093/bjd/ljae377.
6
Artificial Intelligence Promotes the Dunning Kruger Effect: Evaluating ChatGPT Answers to Frequently Asked Questions About Adolescent Idiopathic Scoliosis.
J Am Acad Orthop Surg. 2025 May 1;33(9):473-480. doi: 10.5435/JAAOS-D-24-00297. Epub 2024 Sep 20.
7
Hip Arthroscopic Surgery for Femoroacetabular Impingement With Capsular Management: Factors Associated With Achieving Clinically Significant Outcomes.
Am J Sports Med. 2018 Feb;46(2):288-296. doi: 10.1177/0363546517739824. Epub 2017 Nov 21.
8
ChatGPT Provides Satisfactory but Occasionally Inaccurate Answers to Common Patient Hip Arthroscopy Questions.
Arthroscopy. 2025 May;41(5):1337-1347. doi: 10.1016/j.arthro.2024.06.017. Epub 2024 Jun 22.
9
Assessing artificial intelligence responses to common patient questions regarding inflatable penile prostheses using a publicly available natural language processing tool (ChatGPT).
Can J Urol. 2024 Jun;31(3):11880-11885.
10
Evaluation of Sexual Function Before and After Hip Arthroscopic Surgery for Symptomatic Femoroacetabular Impingement.
Am J Sports Med. 2015 Aug;43(8):1850-6. doi: 10.1177/0363546515584042. Epub 2015 May 12.

Cited by

1
The assessment of ChatGPT-4's performance compared to expert's consensus on chronic lateral ankle instability.
J Exp Orthop. 2025 Aug 5;12(3):e70393. doi: 10.1002/jeo2.70393. eCollection 2025 Jul.