Effectiveness of AI-powered Chatbots in responding to orthopaedic postgraduate exam questions-an observational study.

Author information

Vaishya Raju, Iyengar Karthikeyan P, Patralekh Mohit Kumar, Botchu Rajesh, Shirodkar Kapil, Jain Vijay Kumar, Vaish Abhishek, Scarlat Marius M

Affiliations

Department of Orthopaedics, Indraprastha Apollo Hospitals, Sarita Vihar, New Delhi, 110076, India.

Department of Orthopaedics, Southport and Ormskirk Hospital, Mersey West Lancashire Teaching NHS Trust, Southport, UK.

Publication information

Int Orthop. 2024 Aug;48(8):1963-1969. doi: 10.1007/s00264-024-06182-9. Epub 2024 Apr 15.


DOI:10.1007/s00264-024-06182-9
PMID:38619565
Abstract

PURPOSE: This study analyses the performance and proficiency of three Artificial Intelligence (AI) generative chatbots (ChatGPT-3.5, ChatGPT-4.0 and Bard Google AI®) in answering the Multiple Choice Questions (MCQs) of postgraduate (PG) level orthopaedic qualifying examinations. METHODS: A series of 120 mock 'Single Best Answer' (SBA) MCQs, each with four possible options (A, B, C and D), was compiled on various musculoskeletal (MSK) conditions covering the Trauma and Orthopaedic curricula. A standardised text prompt was used to put the questions to ChatGPT (both 3.5 and 4.0 versions) and Google Bard, and the responses were then statistically analysed. RESULTS: Significant differences were found between the responses of ChatGPT-3.5 and ChatGPT-4.0 (chi-square = 27.2, P < 0.001), and on comparing each of ChatGPT-3.5 (chi-square = 63.852, P < 0.001) and ChatGPT-4.0 (chi-square = 44.246, P < 0.001) with Bard Google AI®. Bard Google AI® had 100% efficiency and was significantly more efficient than both ChatGPT-3.5 and ChatGPT-4.0 (P < 0.0001). CONCLUSION: The results demonstrate the variable potential of the different AI generative chatbots (ChatGPT-3.5, ChatGPT-4.0 and Bard Google AI®) in their ability to answer the MCQs of PG-level orthopaedic qualifying examinations. Bard Google AI® performed better than both ChatGPT versions, underlining the potential of such large language models in processing and applying orthopaedic subspecialty knowledge at a PG level.
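The abstract reports pairwise chi-square comparisons but does not say how the tables were built. As an illustration only, the sketch below shows one common way such a comparison could be run: a Pearson chi-square test on a 2x2 table of correct versus incorrect answers for two chatbots. The function name `chi_square_2x2` and the counts are hypothetical, not the study's data, and the authors may have used a different table layout or a continuity correction.

```python
import math


def chi_square_2x2(correct_a, wrong_a, correct_b, wrong_b):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are the two chatbots; columns are correct / incorrect answers.
    Assumes every row and column total is non-zero.
    """
    table = [[correct_a, wrong_a], [correct_b, wrong_b]]
    total = correct_a + wrong_a + correct_b + wrong_b
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the hypothesis of equal accuracy.
            expected = row_sums[i] * col_sums[j] / total
            stat += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts out of 120 questions each; NOT taken from the paper.
stat, p = chi_square_2x2(correct_a=70, wrong_a=50, correct_b=100, wrong_b=20)
print(f"chi-square = {stat:.3f}, p = {p:.2e}")
```

A 2x2 table has one degree of freedom, which is why the p-value can be computed from the standard library's `math.erfc` without a statistics package.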
