Tassoker Melek
Department of Dentomaxillofacial Radiology, Faculty of Dentistry, Necmettin Erbakan University, Baglarbasi sk, Meram, Konya, 42050, Türkiye.
BMC Oral Health. 2025 Feb 1;25(1):173. doi: 10.1186/s12903-025-05554-w.
This study evaluates and compares the performance of ChatGPT-3.5, ChatGPT-4 Omni (4o), Google Bard, and Microsoft Copilot in responding to text-based multiple-choice questions related to oral radiology, as featured in the Dental Specialty Admission Exam conducted in Türkiye.
A collection of text-based multiple-choice questions was sourced from the open-access question bank of the Turkish Dental Specialty Admission Exam, covering the years 2012 to 2021. The study included 123 questions, each with five options and one correct answer. The accuracy levels of ChatGPT-3.5, ChatGPT-4o, Google Bard, and Microsoft Copilot were compared using descriptive statistics, the Kruskal-Wallis test, Dunn's post hoc test, and Cochran's Q test.
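For readers unfamiliar with Cochran's Q, the test compares proportions of a binary outcome (here, correct vs. incorrect) across several related groups answering the same items. A minimal sketch of the statistic on hypothetical data (not the study's data or code):

```python
# Minimal sketch (hypothetical data, not the study's code): Cochran's Q
# statistic for paired binary outcomes, e.g. correct (1) / incorrect (0)
# answers from k chatbots on the same set of questions.

def cochrans_q(rows):
    """rows: one tuple of 0/1 outcomes per question, one entry per chatbot.

    Returns the Q statistic, approximately chi-square distributed with
    k - 1 degrees of freedom under the null hypothesis of equal accuracy.
    """
    k = len(rows[0])                                           # number of chatbots
    col_totals = [sum(r[j] for r in rows) for j in range(k)]   # correct count per chatbot
    row_totals = [sum(r) for r in rows]                        # correct count per question
    grand = sum(row_totals)
    num = (k - 1) * (k * sum(c * c for c in col_totals) - grand * grand)
    den = k * grand - sum(r * r for r in row_totals)
    return num / den

# Hypothetical correctness patterns: 5 questions, 4 chatbots.
answers = [(1, 1, 0, 0), (1, 0, 0, 0), (1, 1, 1, 0), (1, 0, 1, 0), (0, 1, 0, 0)]
q = cochrans_q(answers)  # q = 105/17, about 6.18
# Compare q against the chi-square critical value 7.815 (df = 3, alpha = 0.05).
```

With two groups the statistic reduces to the (uncorrected) McNemar test, which is a convenient sanity check for the arithmetic.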
The accuracy of the responses generated by the four chatbots differed significantly (p < 0.001). ChatGPT-4o achieved the highest accuracy at 86.1%, followed by Google Bard at 61.8%. ChatGPT-3.5 demonstrated an accuracy rate of 43.9%, while Microsoft Copilot recorded a rate of 41.5%.
ChatGPT-4o demonstrated superior accuracy and reasoning ability, positioning it as a promising educational tool. With regular updates, it has the potential to serve as a reliable source of information for both healthcare professionals and the general public.
Not applicable.