
Integrating ChatGPT in Orthopedic Education for Medical Undergraduates: Randomized Controlled Trial.

Author Affiliations

The First Clinical Medical College of Jinan University, The First Affiliated Hospital of Jinan University, Guangzhou, China.

Department of Joint Surgery and Sports Medicine, Zhuhai People's Hospital (Zhuhai Hospital Affiliated With Jinan University), Zhuhai, Guangdong, China.

Publication Information

J Med Internet Res. 2024 Aug 20;26:e57037. doi: 10.2196/57037.

DOI: 10.2196/57037
PMID: 39163598
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11372336/
Abstract

BACKGROUND

ChatGPT is a natural language processing model developed by OpenAI, which can be iteratively updated and optimized to accommodate the changing and complex requirements of human verbal communication.

OBJECTIVE

The study aimed to evaluate ChatGPT's accuracy in answering orthopedics-related multiple-choice questions (MCQs) and assess its short-term effects as a learning aid through a randomized controlled trial. In addition, long-term effects on student performance in other subjects were measured using final examination results.

METHODS

We first evaluated ChatGPT's accuracy in answering MCQs pertaining to orthopedics across various question formats. Then, 129 undergraduate medical students participated in a randomized controlled study in which the ChatGPT group used ChatGPT as a learning tool, while the control group was prohibited from using artificial intelligence software to support learning. Following a 2-week intervention, the 2 groups' understanding of orthopedics was assessed by an orthopedics test, and variations in the 2 groups' performance in other disciplines were noted through a follow-up at the end of the semester.

RESULTS

ChatGPT-4.0 answered 1051 orthopedics-related MCQs with a 70.60% (742/1051) accuracy rate, including 71.8% (237/330) accuracy for A1 MCQs, 73.7% (330/448) accuracy for A2 MCQs, 70.2% (92/131) accuracy for A3/4 MCQs, and 58.5% (83/142) accuracy for case analysis MCQs. As of April 7, 2023, a total of 129 individuals participated in the experiment. However, 19 individuals withdrew from the experiment at various phases; thus, as of July 1, 2023, a total of 110 individuals accomplished the trial and completed all follow-up work. After we intervened in the learning style of the students in the short term, the ChatGPT group answered more questions correctly than the control group (ChatGPT group: mean 141.20, SD 26.68; control group: mean 130.80, SD 25.56; P=.04) in the orthopedics test, particularly on A1 (ChatGPT group: mean 46.57, SD 8.52; control group: mean 42.18, SD 9.43; P=.01), A2 (ChatGPT group: mean 60.59, SD 10.58; control group: mean 56.66, SD 9.91; P=.047), and A3/4 MCQs (ChatGPT group: mean 19.57, SD 5.48; control group: mean 16.46, SD 4.58; P=.002). At the end of the semester, we found that the ChatGPT group performed better on final examinations in surgery (ChatGPT group: mean 76.54, SD 9.79; control group: mean 72.54, SD 8.11; P=.02) and obstetrics and gynecology (ChatGPT group: mean 75.98, SD 8.94; control group: mean 72.54, SD 8.66; P=.04) than the control group.
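The between-group comparisons above can be sanity-checked from the reported summary statistics alone. A minimal sketch, assuming a Welch two-sample t test and an even 55/55 split of the 110 completers (the abstract does not report per-group sizes), approximately reproduces the reported P=.04 for the total orthopedics-test score:

```python
import math
from statistics import NormalDist

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's unequal-variance t statistic with a normal-approximation
    two-sided p-value (adequate here, since df is over 100)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Total orthopedics-test score: ChatGPT group vs control group.
# NOTE: the 55/55 group split is an assumption for illustration;
# the abstract reports only means, SDs, and the 110 total.
t, p = welch_t(141.20, 26.68, 55, 130.80, 25.56, 55)
print(f"t = {t:.2f}, two-sided p = {p:.3f}")  # close to the reported P=.04
```

The same function applied to the A1, A2, and A3/4 subscores lands near the reported .01, .047, and .002 values, which suggests the abstract's comparisons are simple two-sample tests on these summary statistics.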

CONCLUSIONS

ChatGPT answers orthopedics-related MCQs accurately, and students using it excel in both short-term and long-term assessments. Our findings strongly support ChatGPT's integration into medical education, enhancing contemporary instructional methods.

TRIAL REGISTRATION

Chinese Clinical Trial Registry ChiCTR2300071774; https://www.chictr.org.cn/hvshowproject.html?id=225740&v=1.0.


Figures (PMC)

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5e9d/11372336/d49e5a25bf01/jmir_v26i1e57037_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5e9d/11372336/8ae1023932ce/jmir_v26i1e57037_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5e9d/11372336/b1cd57a81225/jmir_v26i1e57037_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5e9d/11372336/29152c1f704c/jmir_v26i1e57037_fig4.jpg

Similar Articles

1
Integrating ChatGPT in Orthopedic Education for Medical Undergraduates: Randomized Controlled Trial.
J Med Internet Res. 2024 Aug 20;26:e57037. doi: 10.2196/57037.
2
Can ChatGPT generate practice question explanations for medical students, a new faculty teaching tool?
Med Teach. 2025 Mar;47(3):560-564. doi: 10.1080/0142159X.2024.2363486. Epub 2024 Jun 20.
3
Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study.
JMIR Med Educ. 2024 Oct 3;10:e52746. doi: 10.2196/52746.
4
Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis.
J Med Internet Res. 2024 Jul 25;26:e60807. doi: 10.2196/60807.
5
ChatGPT, Bard, and Bing Chat Are Large Language Processing Models That Answered Orthopaedic In-Training Examination Questions With Similar Accuracy to First-Year Orthopaedic Surgery Residents.
Arthroscopy. 2025 Mar;41(3):557-562. doi: 10.1016/j.arthro.2024.08.023. Epub 2024 Aug 28.
6
Appraisal of ChatGPT's Aptitude for Medical Education: Comparative Analysis With Third-Year Medical Students in a Pulmonology Examination.
JMIR Med Educ. 2024 Jul 23;10:e52818. doi: 10.2196/52818.
7
ChatGPT's performance in German OB/GYN exams - paving the way for AI-enhanced medical education and clinical practice.
Front Med (Lausanne). 2023 Dec 13;10:1296615. doi: 10.3389/fmed.2023.1296615. eCollection 2023.
8
Using large language models (ChatGPT, Copilot, PaLM, Bard, and Gemini) in Gross Anatomy course: Comparative analysis.
Clin Anat. 2025 Mar;38(2):200-210. doi: 10.1002/ca.24244. Epub 2024 Nov 21.
9
Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam.
Int J Nurs Stud. 2024 May;153:104717. doi: 10.1016/j.ijnurstu.2024.104717. Epub 2024 Feb 8.
10
Claude, ChatGPT, Copilot, and Gemini performance versus students in different topics of neuroscience.
Adv Physiol Educ. 2025 Jun 1;49(2):430-437. doi: 10.1152/advan.00093.2024. Epub 2025 Jan 17.

Cited By

1
Diagnostic Performance of ChatGPT-4o in Analyzing Oral Mucosal Lesions: A Comparative Study with Experts.
Medicina (Kaunas). 2025 Jul 30;61(8):1379. doi: 10.3390/medicina61081379.
2
Effectiveness of AI-assisted medical education for Chinese undergraduate medical students: a meta-analysis.
BMC Med Educ. 2025 Aug 27;25(1):1207. doi: 10.1186/s12909-025-07770-y.
3
Effectiveness of generative artificial intelligence-based teaching versus traditional teaching methods in medical education: a meta-analysis of randomized controlled trials.
BMC Med Educ. 2025 Aug 19;25(1):1175. doi: 10.1186/s12909-025-07750-2.
4
Exploring ChatGPT's Efficacy in Orthopaedic Arthroplasty Questions Compared to Adult Reconstruction Surgeons.
Arthroplast Today. 2025 Jul 14;34:101772. doi: 10.1016/j.artd.2025.101772. eCollection 2025 Aug.
5
ChatGPT as a Learning Tool for Medical Students: Results From a Randomized Controlled Trial.
Cureus. 2025 Jun 11;17(6):e85767. doi: 10.7759/cureus.85767. eCollection 2025 Jun.
6
Evaluating Large Language Models for Preoperative Patient Education in Superior Capsular Reconstruction: Comparative Study of Claude, GPT, and Gemini.
JMIR Perioper Med. 2025 Jun 12;8:e70047. doi: 10.2196/70047.
7
Application of AI-assisted multi-advisor system combined with BOPPPS teaching model in clinical pharmacy education.
BMC Med Educ. 2025 May 27;25(1):783. doi: 10.1186/s12909-025-07394-2.
8
Exploring the Application Capability of ChatGPT as an Instructor in Skills Education for Dental Medical Students: Randomized Controlled Trial.
J Med Internet Res. 2025 May 27;27:e68538. doi: 10.2196/68538.
9
Quantum leap in medical mentorship: exploring ChatGPT's transition from textbooks to terabytes.
Front Med (Lausanne). 2025 Apr 28;12:1517981. doi: 10.3389/fmed.2025.1517981. eCollection 2025.
10
Delving into the Practical Applications and Pitfalls of Large Language Models in Medical Education: Narrative Review.
Adv Med Educ Pract. 2025 Apr 18;16:625-636. doi: 10.2147/AMEP.S497020. eCollection 2025.

References

1
Art or Artifact: Evaluating the Accuracy, Appeal, and Educational Value of AI-Generated Imagery in DALL·E 3 for Illustrating Congenital Heart Diseases.
J Med Syst. 2024 May 23;48(1):54. doi: 10.1007/s10916-024-02072-0.
2
The application of large language models in medicine: A scoping review.
iScience. 2024 Apr 23;27(5):109713. doi: 10.1016/j.isci.2024.109713. eCollection 2024 May 17.
3
ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study.
JMIR Form Res. 2024 May 8;8:e51346. doi: 10.2196/51346.
4
Quality and Dependability of ChatGPT and DingXiangYuan Forums for Remote Orthopedic Consultations: Comparative Analysis.
J Med Internet Res. 2024 Mar 14;26:e50882. doi: 10.2196/50882.
5
Exploring Generative Artificial Intelligence-Assisted Medical Education: Assessing Case-Based Learning for Medical Students.
Cureus. 2024 Jan 9;16(1):e51961. doi: 10.7759/cureus.51961. eCollection 2024 Jan.
6
Is ChatGPT 'ready' to be a learning tool for medical undergraduates and will it perform equally in different subjects? Comparative study of ChatGPT performance in tutorial and case-based learning questions in physiology and biochemistry.
Med Teach. 2024 Nov;46(11):1441-1447. doi: 10.1080/0142159X.2024.2308779. Epub 2024 Jan 31.
7
The Role of Large Language Models in Medical Education: Applications and Implications.
JMIR Med Educ. 2023 Aug 14;9:e50945. doi: 10.2196/50945.
8
Sailing the Seven Seas: A Multinational Comparison of ChatGPT's Performance on Medical Licensing Examinations.
Ann Biomed Eng. 2024 Jun;52(6):1542-1545. doi: 10.1007/s10439-023-03338-3. Epub 2023 Aug 8.
9
Performance of ChatGPT on the Situational Judgement Test-A Professional Dilemmas-Based Examination for Doctors in the United Kingdom.
JMIR Med Educ. 2023 Aug 7;9:e48978. doi: 10.2196/48978.
10
The Advent of Generative Language Models in Medical Education.
JMIR Med Educ. 2023 Jun 6;9:e48163. doi: 10.2196/48163.