IRCCS Istituto Ortopedico Galeazzi, Milan, Italy.
Department of Biomedical, Surgical and Dental Sciences, University of Milan, Milan, Italy.
BMC Oral Health. 2024 Feb 24;24(1):274. doi: 10.1186/s12903-024-04046-7.
The aim of this systematic review is to evaluate the diagnostic performance of Artificial Intelligence (AI) models designed for the detection of caries lesions (CL).
An electronic literature search was conducted on PubMed, Web of Science, SCOPUS, LILACS and Embase databases for retrospective, prospective and cross-sectional studies published until January 2023, using the following keywords: artificial intelligence (AI), machine learning (ML), deep learning (DL), artificial neural networks (ANN), convolutional neural networks (CNN), deep convolutional neural networks (DCNN), radiology, detection, diagnosis and dental caries (DC). The quality assessment was performed using the guidelines of QUADAS-2.
Twenty articles that met the selection criteria were evaluated. Five studies were performed on periapical radiographs, nine on bitewings, and six on orthopantomograms. The number of imaging examinations included ranged from 15 to 2900. Four studies investigated ANN models, fifteen CNN models, and two DCNN models. Twelve were retrospective studies, six cross-sectional and two prospective. The following diagnostic performance was achieved in detecting CL: sensitivity from 0.44 to 0.86, specificity from 0.85 to 0.98, precision from 0.50 to 0.94, positive predictive value (PPV) of 0.86, negative predictive value (NPV) of 0.95, accuracy from 0.73 to 0.98, area under the curve (AUC) from 0.84 to 0.98, intersection over union of 0.3-0.4 and 0.78, Dice coefficient of 0.66 and 0.88, and F1-score from 0.64 to 0.92. According to the QUADAS-2 evaluation, most studies exhibited a low risk of bias.
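For readers less familiar with the metrics reported above, the following is a minimal sketch of how they are conventionally derived from a binary confusion matrix. It is not taken from any of the reviewed studies: the class name `ConfusionCounts`, the function `diagnostic_metrics`, and the example counts are hypothetical and chosen only for illustration. AUC is omitted because it requires continuous model scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary detection counts (caries vs. no caries)."""
    tp: int  # lesions correctly detected
    fp: int  # sound surfaces flagged as lesions
    tn: int  # sound surfaces correctly left unflagged
    fn: int  # lesions missed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard count-based metrics named in the abstract."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # also called recall
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),          # same formula as precision
        "npv": _ratio(c.tn, c.tn + c.fn),
        "accuracy": _ratio(c.tp + c.tn, total),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        # For a binary task, Dice is the same formula as F1, and
        # IoU (Jaccard index) relates to it by IoU = Dice / (2 - Dice).
        "dice": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        "iou": _ratio(c.tp, c.tp + c.fp + c.fn),
    }


# Made-up counts for illustration only; not data from the review.
example = ConfusionCounts(tp=80, fp=20, tn=880, fn=20)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In the standard definitions, precision and PPV are the same quantity, and Dice and F1 coincide for a binary task; the abstract lists them separately, presumably following the terminology each included study used. In segmentation studies IoU and Dice are typically computed over pixels, while classification studies compute F1 over images or tooth surfaces, so values reported under different names are not necessarily comparable across studies.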
AI-based models have demonstrated good diagnostic performance and may be an important aid in CL detection. Some limitations of these studies are related to the size and heterogeneity of the datasets. Future studies need to rely on comparable, large, and clinically meaningful datasets.
PROSPERO identifier: CRD42023470708.