

Comparison of artificial intelligence systems in answering prosthodontics questions from the dental specialty exam in Turkey.

Author Information

Tosun Busra, Yilmaz Zeynep Sen

Affiliations

Department of Prosthodontics, Faculty of Dentistry, Bolu Abant Izzet Baysal University, Bolu, Turkey.

Department of Prosthodontics, Faculty of Dentistry, Atatürk University, Erzurum, Turkey.

Publication Information

J Dent Sci. 2025 Jul;20(3):1454-1459. doi: 10.1016/j.jds.2025.01.025. Epub 2025 Jan 31.

Abstract


Artificial intelligence (AI) is increasingly vital in dentistry, supporting diagnostics, treatment planning, and patient education. However, AI systems face challenges, especially in delivering accurate information within specialized dental fields. This study aimed to evaluate the performance of seven AI-based chatbots (ChatGPT-3.5, ChatGPT-4, Gemini, Gemini Advanced, Claude AI, Microsoft Copilot, and Smodin AI) in correctly answering prosthodontics questions from the Dental Specialty Exam (DUS) in Turkey.

MATERIALS AND METHODS

The dataset consisted of 128 multiple-choice prosthodontics questions from the DUS, a national exam administered in Turkey by the Student Selection and Placement Center (ÖSYM) between 2012 and 2021. The questions were classified as either case-based or knowledge-based, and chatbot performance was assessed within each category.

RESULTS

ChatGPT-4 achieved the highest accuracy (75.8%), while Gemini AI had the lowest (46.1%). Gemini AI also produced more incorrect (69) than correct (59) answers. ChatGPT-4 and ChatGPT-3.5 showed significantly higher accuracy on knowledge-based questions than on case-based ones (p < 0.05). For case-based questions, Gemini and Gemini Advanced had the lowest accuracy (36.4%), while the other chatbots averaged 45.5%. On knowledge-based questions, ChatGPT-4 performed best (78.6%) and Gemini AI worst (47%).
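The abstract reports the knowledge-based versus case-based comparison only as p < 0.05 and does not name the statistical test used. As a rough, non-authoritative illustration of how such a per-category comparison could be run, the sketch below tallies a hypothetical correct/incorrect contingency table for one chatbot and applies a chi-squared test of independence; both the counts and the choice of test are assumptions, not data from the paper.

    from scipy.stats import chi2_contingency

    # Hypothetical 2x2 contingency table for a single chatbot
    # (rows: question category, columns: correct / incorrect).
    # Counts are illustrative only; the paper reports percentages.
    table = [
        [83, 23],  # knowledge-based: 83 correct, 23 incorrect (of 106)
        [10, 12],  # case-based:      10 correct, 12 incorrect (of 22)
    ]

    chi2, p, dof, _expected = chi2_contingency(table)

    total_correct = table[0][0] + table[1][0]
    total_items = sum(sum(row) for row in table)
    print(f"overall accuracy: {total_correct / total_items:.1%}")
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")

A test like this asks whether a chatbot's accuracy is independent of question category; a small p-value would mirror the paper's finding that some models answer knowledge-based items significantly better than case-based ones.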

CONCLUSION

ChatGPT-4 excelled in knowledge-based prosthodontic questions, showing potential to enhance dental education through personalized learning and clinical reasoning support. However, its limitations in case-based scenarios highlight the need for optimization to better address complex clinical situations. These findings suggest that AI models can significantly contribute to dental education and clinical practice.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a936/12254736/95dfc9243b5e/gr1.jpg
