ChatGPT's Performance on the Hand Surgery Self-Assessment Exam: A Critical Analysis.

Authors

Han Yuri, Choudhry Hassaam S, Simon Michael E, Katt Brian M

Affiliations

Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ.

Rutgers New Jersey Medical School, Newark, NJ.

Publication information

J Hand Surg Glob Online. 2024 Jan 2;6(2):200-205. doi: 10.1016/j.jhsg.2023.11.014. eCollection 2024 Mar.


DOI: 10.1016/j.jhsg.2023.11.014
PMID: 38903839
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11185878/

Abstract

PURPOSE: To assess the performance of Chat Generative Pre-Trained Transformer (ChatGPT) when answering self-assessment exam questions in hand surgery and to compare its accuracy on text-only questions with its accuracy on questions that included images.

METHODS: This study used 10 self-assessment exams from 2004 to 2013 provided by the American Society for Surgery of the Hand (ASSH). ChatGPT's performance on text-only questions and image-based questions was compared. The primary outcomes were ChatGPT's total score, score on text-only questions, and score on image-based questions. The secondary outcomes were the proportion of questions for which ChatGPT provided additional explanations, the length of those elaborations, and the number of questions for which ChatGPT provided answers with certainty.

RESULTS: Out of 1,583 questions, ChatGPT answered 573 (36.2%) correctly. ChatGPT performed better on text-only questions than on image-based questions. Out of 1,127 text-only questions, ChatGPT answered 442 (39.2%) correctly. Out of the 456 image-based questions, it answered 131 (28.7%) correctly. There was no difference in the proportion of elaborations between text-only and image-based questions. Although there was no difference in the length of elaborations between questions ChatGPT answered correctly and those it answered incorrectly, the elaborations provided for image-based questions were longer than those provided for text-only questions. Out of 1,441 confident answers, 548 (38.0%) were correct; out of 142 unconfident answers, 25 (17.6%) were correct.

CONCLUSIONS: ChatGPT performed poorly on the ASSH self-assessment exams from 2004 to 2013. It performed better on text-only questions. Even with its highest score of 42% for the year 2012, the AI platform would not have received continuing medical education credit from ASSH or the American Board of Surgery. Even when only considering questions without images, ChatGPT's high score of 44% correct would not have "passed" the examination.

CLINICAL RELEVANCE: At this time, medical professionals, trainees, and patients should use ChatGPT with caution as the program has not yet developed proficiency with hand subspecialty knowledge.
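
The percentages in the Results can be checked directly from the reported counts. The short Python sketch below recomputes each one and, as an illustration only, computes a pooled two-proportion z statistic for text-only versus image-based accuracy. The abstract does not say which statistical test the authors used, so the z-test here is an assumption, and the names `REPORTED`, `percent_correct`, and `two_proportion_z` are hypothetical helpers introduced for this example.

```python
from math import sqrt

# (correct, total) counts exactly as reported in the abstract.
REPORTED = {
    "all questions": (573, 1583),
    "text-only": (442, 1127),
    "image-based": (131, 456),
    "confident answers": (548, 1441),
    "unconfident answers": (25, 142),
}


def percent_correct(correct, total):
    """Return the share of correct answers as a percentage, to one decimal."""
    return round(100 * correct / total, 1)


def two_proportion_z(c1, n1, c2, n2):
    """Pooled two-proportion z statistic.

    Assumption: this is a generic textbook test chosen for illustration,
    not necessarily the test used in the paper.
    """
    pooled = (c1 + c2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (c1 / n1 - c2 / n2) / std_err


if __name__ == "__main__":
    for label, (correct, total) in REPORTED.items():
        print(f"{label}: {correct}/{total} = {percent_correct(correct, total)}%")

    z = two_proportion_z(*REPORTED["text-only"], *REPORTED["image-based"])
    print(f"text-only vs image-based, z = {z:.2f}")
```

The subgroup counts are also internally consistent: text-only plus image-based questions, and confident plus unconfident answers, each sum to the overall 573 correct out of 1,583.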

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef1a/11185878/bbc0c0e66665/gr1.jpg

Similar articles

[1]
ChatGPT's Performance on the Hand Surgery Self-Assessment Exam: A Critical Analysis.

J Hand Surg Glob Online. 2024-1-2

[2]
ChatGPT's performance in German OB/GYN exams - paving the way for AI-enhanced medical education and clinical practice.

Front Med (Lausanne). 2023-12-13

[3]
Performance of an Artificial Intelligence Chatbot in Ophthalmic Knowledge Assessment.

JAMA Ophthalmol. 2023-6-1

[4]
Performance of ChatGPT on American Board of Surgery In-Training Examination Preparation Questions.

J Surg Res. 2024-7

[5]
How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment.

JMIR Med Educ. 2023-2-8

[6]
Assessment of ChatGPT's performance on neurology written board examination questions.

BMJ Neurol Open. 2023-11-2

[7]
Performance of ChatGPT on the Chinese Postgraduate Examination for Clinical Medicine: Survey Study.

JMIR Med Educ. 2024-2-9

[8]
ChatGPT Earns American Board Certification in Hand Surgery.

Hand Surg Rehabil. 2024-6

[9]
Comparison of Gemini Advanced and ChatGPT 4.0's Performances on the Ophthalmology Resident Ophthalmic Knowledge Assessment Program (OKAP) Examination Review Question Banks.

Cureus. 2024-9-17

[10]
Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam.

Int J Nurs Stud. 2024-5

Cited by

[1]
Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0.

J Hand Microsurg. 2025-5-5

[2]
Exploring the Current Applications of Artificial Intelligence in Orthopaedic Surgical Training: A Systematic Scoping Review.

Cureus. 2025-4-3

[3]
Matching Human Expertise: ChatGPT's Performance on Hand Surgery Examinations.

Hand (N Y). 2025-3-20

[4]
Evaluation of Chat Generative Pre-trained Transformer and Microsoft Copilot Performance on the American Society of Surgery of the Hand Self-Assessment Examinations.

J Hand Surg Glob Online. 2024-11-13

[5]
Evaluating the Performance of ChatGPT4.0 Versus ChatGPT3.5 on the Hand Surgery Self-Assessment Exam: A Comparative Analysis of Performance on Image-Based Questions.

Cureus. 2025-1-16

[6]
Examining the Role of Large Language Models in Orthopedics: Systematic Review.

J Med Internet Res. 2024-11-15

[7]
The Performance of a Customized Generative Pre-trained Transformer on the American Society for Surgery of the Hand Self-Assessment Examination.

Cureus. 2024-9-25

[8]
ChatGPT-4 Surpasses Residents: A Study of Artificial Intelligence Competency in Plastic Surgery In-service Examinations and Its Advancements from ChatGPT-3.5.

Plast Reconstr Surg Glob Open. 2024-9-5

[9]
ChatGPT-4 Can Help Hand Surgeons Communicate Better With Patients.

J Hand Surg Glob Online. 2024-4-6

[10]
The Performance of ChatGPT on the American Society for Surgery of the Hand Self-Assessment Examination.

Cureus. 2024-4-24
