Department of Orofacial Pain and Oral Medicine, Kyung Hee University Dental Hospital, Kyung Hee University School of Dentistry, #613 Hoegi-Dong, Dongdaemun-gu, Seoul, 02447, Korea.
Department of Computer Science, Hanyang University, Seoul, 04763, Korea.
Sci Rep. 2024 Aug 14;14(1):18865. doi: 10.1038/s41598-024-69848-9.
This study investigated the usefulness of deep learning-based automatic detection of temporomandibular joint (TMJ) effusion using magnetic resonance imaging (MRI) in patients with temporomandibular disorder and whether the diagnostic accuracy of the model improved when patients' clinical information was provided in addition to MRI images. The sagittal MR images of 2948 TMJs were collected from 1017 women and 457 men (mean age 37.19 ± 18.64 years). The TMJ effusion diagnostic performances of three convolutional neural networks (from-scratch, fine-tuning, and freeze schemes) were compared with those of human experts based on areas under the curve (AUCs) and diagnostic accuracies. The fine-tuning model with proton density (PD) images showed acceptable prediction performance (AUC = 0.7895), whereas the from-scratch (0.6193) and freeze (0.6149) models performed worse (p < 0.05). The fine-tuning model had markedly higher specificity than the human experts (87.25% vs. 58.17%). However, the human experts were superior in sensitivity (80.00% vs. 57.43%) (all p < 0.001). In gradient-weighted class activation mapping (Grad-CAM) visualizations, the fine-tuning scheme focused more on effusion than on other structures of the TMJ, and its sparsity was higher than that of the from-scratch scheme (82.40% vs. 49.83%, p < 0.05). The Grad-CAM visualizations indicated that the model had learned important features in the TMJ area, particularly around the articular disc. Using two fine-tuning models, on PD and T2-weighted images, did not improve diagnostic performance compared with using PD alone (p < 0.05). AUCs varied across groups when the patients were divided according to age (0.7083-0.8375) and sex (male: 0.7576, female: 0.7083). The prediction accuracy of the ensemble model was higher than that of the human experts when all the data were used (74.21% vs. 67.71%, p < 0.05).
A deep neural network (DNN) was developed to process multimodal data, including MRI and patient clinical data. Analysis of four age groups with the DNN model showed that the 41-60 age group had the best performance (AUC = 0.8258). The fine-tuning model and DNN were optimal for judging TMJ effusion and may be used to prevent true negative cases and aid in human diagnostic performance. Assistive automated diagnostic methods have the potential to increase clinicians' diagnostic accuracy.
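The comparison metrics named in the abstract (sensitivity, specificity, accuracy, and AUC) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = effusion present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true effusion cases that were detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of effusion-free cases that were correctly cleared."""
    return tn / (tn + fp)


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]
TOY_SCORES = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.1, 0.05]
THRESHOLD = 0.5  # assumed cut-off; the abstract does not state one
TOY_PREDS = [1 if s >= THRESHOLD else 0 for s in TOY_SCORES]
```

On this toy data the classifier catches two of four positives and clears three of four negatives, giving the same qualitative pattern the abstract reports for the fine-tuning model: higher specificity than sensitivity at a fixed threshold, while the AUC summarizes performance across all thresholds.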