Bhadila Ghalia Y, Alhomied Mody, Mahmoud Abeer, Farsi Nada J
Assistant Professor, Pediatric Dentistry Department, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia.
General Dentist, Department of Dental Public Health, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia.
Pediatr Dent. 2025 Mar 15;47(2):73-78.
Purpose: To assess the diagnostic and treatment decision-making accuracy of ChatGPT for various dental problems in pediatric patients compared with that of specialized pediatric dentists.
Methods: The study included 12 cases, each with an average of three dental problems, for a total of 36 dental problems. Successive prompts were given to ChatGPT (GPT-4), beginning with a comprehensive case presentation, followed by clinical and radiographic descriptions alongside clinical and radiographic images. Questions regarding diagnosis and treatment were then posed to the model. Accuracy was scored by the degree of alignment between ChatGPT's outputs and the decisions of a pediatric dentistry committee, which served as the reference standard on the basis of its advanced training and clinical experience.
Results: ChatGPT's diagnostic accuracy was 72.2 percent, with a kappa statistic of 0.69 (95 percent confidence interval [95% CI] = 0.6 to 0.8). In detecting dental caries, ChatGPT achieved a sensitivity of 92.3 percent and a specificity of 100 percent, with positive and negative predictive values of 100 percent and 83.3 percent, respectively. ChatGPT's treatment decision accuracy was 47.2 percent, with a kappa value of 0.43 (95% CI = 0.4 to 0.6). The difference between ChatGPT's accuracy in diagnosis and in treatment decisions was statistically significant (P=0.01).
Conclusions: ChatGPT achieved high diagnostic accuracy but showed limited capability in making treatment decisions for pediatric dental problems. ChatGPT may serve as a secondary aid in diagnosis; however, it cannot be regarded as a reliable tool for therapeutic decision-making.
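The caries-detection metrics reported above (sensitivity, specificity, and the predictive values) all derive from a standard 2x2 confusion matrix. The sketch below shows the calculation; the counts used are hypothetical values chosen only to reproduce the reported percentages, and the study's actual counts may differ.

```python
# Illustrative sketch: diagnostic-performance metrics from a 2x2 table.
# The counts below are ASSUMED for illustration, not taken from the paper.

def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV as percentages."""
    sensitivity = 100 * tp / (tp + fn)  # caries correctly detected
    specificity = 100 * tn / (tn + fp)  # caries-free correctly cleared
    ppv = 100 * tp / (tp + fp)          # positive predictive value
    npv = 100 * tn / (tn + fn)          # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical counts consistent with the reported figures:
# 12 caries detected, 1 missed, 5 caries-free cleared, 0 false alarms.
sens, spec, ppv, npv = diagnostic_metrics(tp=12, fp=0, tn=5, fn=1)
print(f"sensitivity={sens:.1f}%  specificity={spec:.1f}%  "
      f"PPV={ppv:.1f}%  NPV={npv:.1f}%")
```

With these assumed counts the formulas yield sensitivity 92.3 percent, specificity 100 percent, PPV 100 percent, and NPV 83.3 percent, matching the abstract.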