Ammar Nour, Kühnisch Jan
Department of Conservative Dentistry and Periodontology, University Hospital, Ludwig-Maximilian University of Munich, Munich 80336, Germany.
Department of Pediatric Dentistry and Dental Public Health, Faculty of Dentistry, Alexandria University, Alexandria 21257, Egypt.
Jpn Dent Sci Rev. 2024 Dec;60:128-136. doi: 10.1016/j.jdsr.2024.02.001. Epub 2024 Feb 29.
The accuracy of artificial intelligence (AI)-aided caries diagnosis can vary considerably depending on numerous factors. This review aimed to assess the diagnostic accuracy of AI models for caries detection and classification on bitewing radiographs. Publications after 2010 were screened in five databases. A customized risk of bias (RoB) assessment tool was developed and applied to the 14 articles that met the inclusion criteria out of 935 references. Dataset sizes ranged from 112 to 3686 radiographs. While 86 % of the studies reported a model with an accuracy of ≥80 %, most exhibited unclear or high risk of bias. Three studies compared the model's diagnostic performance to that of dentists; in these, the models consistently showed higher average sensitivity. Five studies were included in a bivariate diagnostic random-effects meta-analysis for overall caries detection. The diagnostic odds ratio was 55.8 (95 % CI = 28.8 - 108.3), and the summary sensitivity and specificity were 0.87 (0.76 - 0.94) and 0.89 (0.75 - 0.96), respectively. Independent meta-analyses for dentin and enamel caries detection were conducted and showed sensitivities of 0.84 (0.80 - 0.87) and 0.71 (0.66 - 0.75), respectively. Despite the promising diagnostic performance of AI models, the lack of high-quality, adequately reported, and externally validated studies highlights current challenges and future research needs.
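To make the relationship between the reported summary measures concrete, the following is a minimal Python sketch of the standard definitions of sensitivity, specificity, and the diagnostic odds ratio (DOR). It is an illustration only, not the review's bivariate random-effects model: the function names and the 2x2 counts are hypothetical. Plugging in the rounded summary sensitivity (0.87) and specificity (0.89) gives a DOR of roughly 54, close to the reported 55.8; the gap is consistent with rounding of the summary values, although the pooled estimate may also have been computed differently.

```python
def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from the cells of a 2x2 diagnostic table."""
    sensitivity = tp / (tp + fn)  # share of carious surfaces the model flags
    specificity = tn / (tn + fp)  # share of sound surfaces the model clears
    return sensitivity, specificity


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = [Se / (1 - Se)] / [(1 - Sp) / Sp]."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    odds_positive_if_diseased = sensitivity / (1.0 - sensitivity)
    odds_positive_if_healthy = (1.0 - specificity) / specificity
    return odds_positive_if_diseased / odds_positive_if_healthy


if __name__ == "__main__":
    # Hypothetical counts chosen only to reproduce Se = 0.87 and Sp = 0.89.
    se, sp = metrics_from_counts(tp=87, fp=11, fn=13, tn=89)
    print(f"Se={se:.2f}, Sp={sp:.2f}, DOR={diagnostic_odds_ratio(se, sp):.1f}")
```

A DOR of 1 means the test does not discriminate at all; larger values indicate better discrimination, which is why a single pooled DOR is often reported alongside summary sensitivity and specificity.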