

DCNN models with post-hoc interpretability for the automated detection of glossitis and OSCC on the tongue.

Author Information

Lee Yeon-Hee, Jeon Seonggwang, Jung Junho, Auh Q Schick, Lee Jae Seo, Chaurasia Akhilanand, Noh Yung Kyun

Affiliations

Department of Orofacial Pain and Oral Medicine, Kyung Hee University Dental Hospital, Kyung Hee University, #26 Kyunghee-daero, Dongdaemun-gu, Seoul, 02447, South Korea.

Center for Systems Biology, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, 02114, USA.

Publication Information

Sci Rep. 2025 Aug 29;15(1):31940. doi: 10.1038/s41598-025-16760-5.

Abstract

This study aimed to develop and evaluate deep convolutional neural network (DCNN) models with Grad-CAM visualization for the interpretable, automated classification of tongue conditions (specifically glossitis and oral squamous cell carcinoma, OSCC) from clinical tongue photographs, with a focus on their potential for early detection and telemedicine-based diagnostics. A total of 652 tongue images were categorized as normal control (n = 294), glossitis (n = 340), or OSCC (n = 17). Four pretrained DCNN architectures (VGG16, VGG19, ResNet50, ResNet152) were fine-tuned using transfer learning. Model interpretability was enhanced via Grad-CAM and sparsity analysis. Diagnostic performance was assessed using AUROC, with subgroup analyses by age, sex, and image segmentation strategy. For glossitis classification, VGG16 (AUROC = 0.8428, 95% CI 0.7757-0.9100) and VGG19 (AUROC = 0.8639, 95% CI 0.7988-0.9170) performed strongly, while an ensemble of VGG16 and VGG19 achieved the best result (AUROC = 0.8731, 95% CI 0.8072-0.9298). OSCC detection showed near-perfect performance across all models, with VGG19 and ResNet152 achieving AUROC = 1.0000 and VGG16 reaching AUROC = 0.9902 (95% CI 0.9707-1.0000). Diagnostic performance did not differ significantly by age (P = 0.3052) or sex (P = 0.4531), and whole-image classification outperformed patch-wise segmentation, though not significantly (P = 0.7440). DCNN models with Grad-CAM demonstrated robust, interpretable performance in classifying glossitis and OSCC from tongue photographs. These results highlight the potential of AI-driven tongue diagnosis as a valuable tool for remote healthcare, promoting early detection and expanding access to oral health services.
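Grad-CAM, used in the study for post-hoc interpretability, weights each convolutional feature map by the spatially averaged gradient of the class score, sums the weighted maps, and applies a ReLU. The following is a minimal NumPy sketch of that core computation only, using synthetic activations and gradients; it is not the study's implementation, which operates on the actual VGG/ResNet layers.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one conv layer.

    activations: (K, H, W) feature maps A_k
    gradients:   (K, H, W) d(class score)/dA_k
    """
    # alpha_k: global-average-pool each channel's gradient
    weights = gradients.mean(axis=(1, 2))
    # weighted sum of feature maps over the channel axis
    cam = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only regions that positively influence the class score
    cam = np.maximum(cam, 0)
    # normalize to [0, 1] so the map can be overlaid on the input image
    if cam.max() > 0:
        cam /= cam.max()
    return cam

rng = np.random.default_rng(1)
acts = rng.random((8, 7, 7))    # synthetic VGG-style feature maps
grads = rng.random((8, 7, 7))   # synthetic gradients of the class score
heatmap = grad_cam(acts, grads)
print(heatmap.shape)            # (7, 7) spatial saliency map
```

In practice the (7, 7) map is upsampled to the input resolution and overlaid on the tongue photograph to show which regions drove the prediction.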

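The best glossitis result came from ensembling VGG16 and VGG19 and evaluating with AUROC. A common way to build such an ensemble is to average the two models' predicted probabilities; the sketch below illustrates this with synthetic scores (not the study's data) and a self-contained AUROC via the Mann-Whitney U formulation.

```python
import numpy as np

def auroc(labels, scores):
    # AUROC = probability a random positive outscores a random negative
    # (Mann-Whitney U formulation; ties count as 0.5)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diffs = pos[:, None] - neg[None, :]
    return ((diffs > 0).sum() + 0.5 * (diffs == 0).sum()) / (len(pos) * len(neg))

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)  # synthetic glossitis / control labels

# hypothetical per-image probabilities from two fine-tuned models
p_vgg16 = np.clip(y * 0.30 + rng.normal(0.40, 0.25, 200), 0, 1)
p_vgg19 = np.clip(y * 0.35 + rng.normal(0.38, 0.25, 200), 0, 1)

# simple probability-averaging ensemble
p_ens = (p_vgg16 + p_vgg19) / 2

for name, p in [("VGG16", p_vgg16), ("VGG19", p_vgg19), ("ensemble", p_ens)]:
    print(f"{name}: AUROC = {auroc(y, p):.4f}")
```

Averaging probabilities tends to cancel the models' uncorrelated errors, which is consistent with the ensemble's reported edge over either network alone.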

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa7c/12397486/03cb40c7bf9b/41598_2025_16760_Fig1_HTML.jpg
