Li Yajuan, Zou Mingchi, Zhou Xiaogang, Long Xia, Liu Xue, Yao Yanfeng
Department of Ultrasound, Yongchuan Hospital of Chongqing Medical University, Chongqing, China.
Ultrason Imaging. 2025 Mar 29:1617346251319410. doi: 10.1177/01617346251319410.
This study explored the clinical value of applying deep learning to ultrasound images in order to develop an automated model that accurately distinguishes pleomorphic adenomas from Warthin's tumors of the salivary glands. A retrospective study was conducted on 91 patients who underwent ultrasonography between January 2016 and December 2023 and were subsequently diagnosed with pleomorphic adenoma or Warthin's tumor on the basis of postoperative pathology. A total of 526 ultrasonography images were collected for analysis. Convolutional neural network (CNN) models, including ResNet18, MobileNetV3Small, and InceptionV3, were trained and validated on these images to differentiate pleomorphic adenoma from Warthin's tumor. Performance was evaluated with receiver operating characteristic (ROC) curves, area under the curve (AUC), sensitivity, specificity, positive predictive value, and negative predictive value. Two ultrasound physicians with different levels of experience independently evaluated the ultrasound images, and their diagnoses were then compared with the output of the best-performing model. Agreement of the two ultrasonographers' routine interpretations, and of the best model's automatic diagnoses, with the pathological results was assessed using kappa tests. The deep learning models performed well in differentiating pleomorphic adenoma from Warthin's tumor. The ResNet18, MobileNetV3Small, and InceptionV3 models reached diagnostic accuracies of 82.4% (AUC: 0.932), 87.0% (AUC: 0.946), and 77.8% (AUC: 0.811), respectively. Among these models, MobileNetV3Small performed best.
The experienced ultrasonographer achieved a diagnostic accuracy of 73.5%, with sensitivity, specificity, positive predictive value, and negative predictive value of 73.7%, 73.3%, 77.8%, and 68.8%, respectively. The less-experienced ultrasonographer achieved a diagnostic accuracy of 69.0%, with sensitivity, specificity, positive predictive value, and negative predictive value of 66.7%, 71.4%, 71.4%, and 66.7%, respectively. The kappa test revealed strong consistency between the best-performing deep learning model and postoperative pathological diagnoses (kappa value: .778, P-value < .001). In contrast, the less-experienced ultrasonographer demonstrated poor consistency in image interpretations (kappa value: .380, P-value < .05). The diagnostic accuracy of the best deep learning model was significantly higher than that of the ultrasonographers, and the experienced ultrasonographer exhibited higher diagnostic accuracy than the less-experienced one. This study demonstrates the promising performance of a deep learning-based method utilizing ultrasonography images for the differentiation of pleomorphic adenoma and Warthin's tumor. The approach reduces subjective errors, provides decision support for clinicians, and improves diagnostic consistency.
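The reported accuracy, sensitivity, specificity, predictive values, and kappa all derive from a 2x2 table of reader (or model) calls against pathology. The following is a minimal sketch of that arithmetic, not the authors' code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the abstract does not say which tumor type was treated as the positive class. The example counts are also an assumption: the abstract does not report them, and they were chosen only because they reproduce the percentages and kappa given for the less-experienced ultrasonographer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts of a reader or model against pathology (the reference standard)."""
    tp: int  # pathology positive, called positive
    fn: int  # pathology positive, called negative
    fp: int  # pathology negative, called positive
    tn: int  # pathology negative, called negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, sensitivity, specificity, PPV, NPV and Cohen's kappa.

    No guard against empty rows or columns: a zero denominator raises
    ZeroDivisionError.
    """
    n = c.tp + c.fn + c.fp + c.tn
    accuracy = (c.tp + c.tn) / n  # observed agreement with pathology

    # Agreement expected by chance, from the row and column marginals.
    p_both_positive = ((c.tp + c.fn) / n) * ((c.tp + c.fp) / n)
    p_both_negative = ((c.tn + c.fp) / n) * ((c.tn + c.fn) / n)
    p_chance = p_both_positive + p_both_negative

    return {
        "accuracy": accuracy,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


if __name__ == "__main__":
    # Hypothetical counts (not reported in the abstract), consistent with the
    # less-experienced reader's percentages and kappa.
    example = ConfusionCounts(tp=10, fn=5, fp=4, tn=10)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

With these illustrative counts the sketch gives an accuracy of 0.690, sensitivity 0.667, specificity 0.714, PPV 0.714, NPV 0.667, and kappa 0.380. That matches the figures reported for the less-experienced ultrasonographer, although other 2x2 tables could produce the same rounded values.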