




Exploring the role of artificial intelligence in Turkish orthopedic progression exams.

Authors

Ayik Gokhan, Kolac Ulas Can, Aksoy Taha, Yilmaz Abdurrahman, Sili Mazlum Veysel, Tokgozoglu Mazhar, Huri Gazi

Affiliations

Department of Orthopedics and Traumatology, Yuksek Ihtisas University Faculty of Medicine, Ankara, Türkiye.

Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye.

Publication

Acta Orthop Traumatol Turc. 2025 Mar 17;59(1):18-26. doi: 10.5152/j.aott.2025.24090.

DOI: 10.5152/j.aott.2025.24090
PMID: 40337975
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11992947/
Abstract

OBJECTIVE

The aim of this study was to evaluate and compare the performance of the artificial intelligence (AI) models ChatGPT-3.5, ChatGPT-4, and Gemini on the Turkish Specialization Training and Development Examination (UEGS) to determine their utility in medical education and their potential to improve patient care.

METHODS

This retrospective study analyzed responses of ChatGPT-3.5, ChatGPT-4, and Gemini to 1000 true or false questions from UEGS administered over 5 years (2018-2023). Questions, encompassing 9 orthopedic subspecialties, were categorized by 2 independent residents, with discrepancies resolved by a senior author. Artificial intelligence models were restarted for each query to prevent data retention. Performance was evaluated by calculating net scores and comparing them to orthopedic resident scores obtained from the Turkish Orthopedics and Traumatology Education Council (TOTEK) database. Statistical analyses included chi-squared tests, Bonferroni-adjusted Z tests, Cochran's Q test, and receiver operating characteristic (ROC) analysis to determine the optimal question length for AI accuracy. All AI responses were generated independently without retaining prior information.
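The analysis pipeline described above (a chi-squared comparison of model accuracy, plus ROC analysis to relate question length to accuracy) can be sketched roughly as follows. This is a minimal illustration on synthetic data, using scipy/scikit-learn as stand-in tools; the variable names and the length-accuracy relationship are assumptions for the sketch, not the study's data or code.

```python
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in data: 1000 true/false questions with a letter count,
# and a correctness flag per model. None of this is the study's data.
n = 1000
length = rng.integers(40, 400, size=n)
# Illustrative assumption: shorter questions are answered correctly more often.
p_correct = 1.0 / (1.0 + np.exp((length - 200) / 80.0))
correct_gpt4 = rng.random(n) < p_correct
correct_gemini = rng.random(n) < 0.6

# Chi-squared test: does overall accuracy differ between the two models?
table = np.array([
    [correct_gpt4.sum(), n - correct_gpt4.sum()],
    [correct_gemini.sum(), n - correct_gemini.sum()],
])
chi2, p_chi, _, _ = chi2_contingency(table)

# ROC analysis: how well does (negated) question length predict correctness?
# The threshold maximizing tpr - fpr (Youden's J) gives an "optimal" cut-off.
fpr, tpr, thresholds = roc_curve(correct_gpt4, -length)
auc = roc_auc_score(correct_gpt4, -length)
optimal_length = -thresholds[np.argmax(tpr - fpr)]

print(f"chi2={chi2:.1f} (P={p_chi:.3g}), AUC={auc:.2f}, "
      f"optimal cut-off ~{optimal_length} letters")
```

Negating the length turns it into a score where higher predicts "correct", which is the conventional orientation for `roc_curve`.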

RESULTS

Significant differences in AI tool accuracy were observed across different years and subspecialties (P < .001). ChatGPT-4 consistently outperformed the other models, achieving the highest overall accuracy (95% in specific subspecialties). Notably, ChatGPT-4 demonstrated superior performance in Basic and General Orthopedics and in Foot and Ankle Surgery, while Gemini and ChatGPT-3.5 showed variable accuracy across topics and years. Receiver operating characteristic analysis revealed a significant relationship between shorter letter counts and higher accuracy for ChatGPT-4 (P = .002). ChatGPT-4 showed a significant negative correlation between letter count and accuracy across all years (r = -0.099, P = .002) and, unlike the other AI models, outperformed residents in Basic and General Orthopedics (P = .015) and Trauma (P = .012).
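The reported relationship between a binary correctness variable and a continuous letter count (r = -0.099) is what a point-biserial correlation measures. A minimal sketch on synthetic data; the declining accuracy-vs-length relationship here is an illustrative assumption, not the study's data:

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(1)

# Synthetic stand-in data: correctness probability is assumed
# (illustratively) to decline as question length grows.
n = 1000
length = rng.integers(40, 400, size=n).astype(float)
p_correct = np.clip(0.95 - 0.001 * length, 0.05, 0.95)
correct = (rng.random(n) < p_correct).astype(int)

# Point-biserial correlation between the binary correctness variable
# and the continuous letter count; r < 0 means longer -> less accurate.
r, p = pointbiserialr(correct, length)
print(f"r = {r:.3f}, P = {p:.3g}")
```

With ~1000 questions, even a small effect of this kind can reach significance, which is consistent with the modest but significant r reported in the abstract.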

CONCLUSION

The findings underscore the advancing role of AI in the medical field, with ChatGPT-4 demonstrating significant potential as a tool for medical education and clinical decision-making. Continuous evaluation and refinement of AI technologies are essential to enhance their educational and clinical impact.
