Adult Restorative Dentistry, Oman Dental College, Muscat, Oman.
Restorative Dentistry, Dundee Dental Hospital and School, University of Dundee, Dundee, UK.
Diagnosis (Berl). 2024 May 3;11(3):259-265. doi: 10.1515/dx-2024-0034. eCollection 2024 Aug 1.
This study evaluates the comparative diagnostic accuracy of dental students and artificial intelligence (AI), specifically a modified ChatGPT 4, in endodontic assessments related to pulpal and apical conditions. The findings are intended to offer insights into the potential role of AI in augmenting dental education.
The study involved 109 dental students divided into junior (54) and senior (55) groups, and compared their diagnostic accuracy with ChatGPT's across seven clinical scenarios. Juniors had American Association of Endodontists (AAE) terminology assistance, while seniors relied on prior knowledge. Accuracy was measured against a gold standard set by experienced endodontists, using statistical analysis including the Kruskal-Wallis and Dwass-Steel-Critchlow-Fligner tests.
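The cohort comparison described above can be sketched with a dependency-free Kruskal-Wallis H test. All accuracy scores below are invented for illustration and are not the study's data; the Dwass-Steel-Critchlow-Fligner post hoc comparisons the study also applied are omitted here.

```python
# Minimal sketch of the study's statistical workflow: per-participant
# accuracy scores (%) for three cohorts compared with a Kruskal-Wallis
# H test. Scores are invented for illustration, NOT the study's data.
import math
from collections import Counter

def kruskal_wallis_3(a, b, c):
    """Kruskal-Wallis H test for exactly three groups (df = 2), with tie
    correction. Returns (H, p); p uses the chi-square approximation,
    which for df = 2 reduces to exp(-H / 2)."""
    groups = [a, b, c]
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Assign each distinct value the average of the ranks it occupies
    # (the standard treatment of ties).
    ranks = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sums = [sum(ranks[v] for v in g) for g in groups]
    h = (12 / (n_total * (n_total + 1))
         * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
         - 3 * (n_total + 1))
    # Correct H for ties across the pooled sample.
    tie_counts = Counter(pooled).values()
    correction = 1 - sum(t**3 - t for t in tie_counts) / (n_total**3 - n_total)
    h /= correction
    return h, math.exp(-h / 2)

# Invented per-participant accuracy scores (%), six per cohort.
chatgpt = [100.0, 100.0, 100.0, 100.0, 100.0, 85.7]
seniors = [85.7, 71.4, 85.7, 100.0, 71.4, 85.7]
juniors = [71.4, 85.7, 71.4, 85.7, 57.1, 85.7]

h_stat, p_value = kruskal_wallis_3(chatgpt, seniors, juniors)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A significant H only says the three cohorts differ somewhere; pairwise post hoc tests such as Dwass-Steel-Critchlow-Fligner are then needed to locate which cohorts drive the difference, as the study reports.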
ChatGPT achieved significantly higher accuracy (99.0%) than seniors (79.7%) and juniors (77.0%). Median accuracy was 100.0% for ChatGPT, 85.7% for seniors, and 82.1% for juniors. Statistical tests indicated significant differences between ChatGPT and both student groups (p<0.001), with no notable difference between the student cohorts.
The study reveals AI's capability to outperform dental students in diagnostic accuracy in endodontic assessments. This underscores AI's potential as a reference tool that students could use to enhance their understanding and diagnostic skills. Nevertheless, the risk of overreliance on AI, which may affect the development of critical analytical and decision-making abilities, necessitates a balanced integration of AI with human expertise and clinical judgement in dental education. Future research is essential to navigate the ethical and legal frameworks for incorporating AI tools such as ChatGPT into dental education and clinical practice effectively.