Zhang Xuan, Guo Enting, Liu Xu, Zhao Hong, Yang Jie, Li Wen, Wu Wenlei, Sun Weibin
Department of Periodontics, Affiliated Hospital of Medical School, Nanjing Stomatological Hospital, Research Institute of Stomatology, Nanjing University, Nanjing, China.
Division of Computer Science, University of Aizu, Aizu, Japan.
BMC Oral Health. 2025 Jan 29;25(1):153. doi: 10.1186/s12903-025-05431-6.
The severity of furcation involvement (FI) directly affects tooth prognosis and influences treatment approaches. However, assessing, diagnosing, and treating molars with FI is complicated by anatomical and morphological variation. Cone-beam computed tomography (CBCT) improves diagnostic accuracy for detecting FI and measuring furcation defects, but the high cost and radiation dose associated with CBCT equipment limit its widespread use. The aim of this study was to evaluate the performance of the Vision Transformer (ViT) against several commonly used traditional deep learning (DL) models for classifying molars with or without FI on panoramic radiographs.
A total of 1,568 tooth images obtained from 506 panoramic radiographs were used to construct the database and evaluate the models. This study developed and assessed a ViT model for classifying FI from panoramic radiographs, and compared its performance with traditional models, including Multi-Layer Perceptron (MLP), Visual Geometry Group (VGG)Net, and GoogLeNet.
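Unlike the convolutional baselines (VGGNet, GoogLeNet), a ViT first cuts each input image into fixed-size patches that are flattened into tokens for the transformer encoder. The sketch below illustrates that tokenization step in pure Python; the patch size (16) and input resolution (224×224) are common ViT defaults and are assumptions here, not details reported by the study.

```python
# Minimal sketch of ViT patch tokenization: split a 2-D grayscale image
# (given as a list of pixel rows) into flattened, non-overlapping
# patch_size x patch_size patches. Illustrative only; the study's actual
# preprocessing pipeline and patch size are not specified in the abstract.

def image_to_patches(image, patch_size=16):
    """Return a list of flattened patches, row-major over the image."""
    h, w = len(image), len(image[0])
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patch = [image[top + r][left + c]
                     for r in range(patch_size)
                     for c in range(patch_size)]
            patches.append(patch)
    return patches

# A 224x224 crop split into 16x16 patches yields 14*14 = 196 tokens,
# each of length 16*16 = 256 (for a single grayscale channel).
dummy = [[0] * 224 for _ in range(224)]
tokens = image_to_patches(dummy)
print(len(tokens), len(tokens[0]))  # 196 256
```

Each token is then linearly projected and combined with a positional embedding before entering the transformer's self-attention layers, which is what lets ViT model long-range dependencies across the radiograph.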
Among the evaluated models, the ViT model outperformed all others, achieving the highest precision (0.98), recall (0.92), F1 score (0.95), and accuracy (92%), along with the lowest cross-entropy loss (0.27). ViT also recorded the highest area under the curve (AUC) of 98%, exceeding the other models with statistically significant differences (p < 0.05) and confirming its stronger classification capability. Gradient-weighted class activation mapping (Grad-CAM) analysis of the ViT model revealed the key image regions the model focused on during prediction.
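The reported metrics are internally consistent: the F1 score is the harmonic mean of precision and recall, so the reported precision (0.98) and recall (0.92) should reproduce the reported F1 (0.95). A quick check:

```python
# Verify that the reported F1 score follows from the reported
# precision and recall via the harmonic-mean definition.

def f1_score(precision, recall):
    """F1 = harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.98, 0.92), 2))  # 0.95, matching the reported value
```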
DL algorithms can automatically classify FI using readily accessible panoramic images. These findings demonstrate that ViT outperforms the tested traditional models, highlighting the potential of transformer-based approaches to significantly advance image classification. This approach is also expected to reduce both the radiation dose and the financial burden on patients while simultaneously improving diagnostic precision.