Nath Abhay, Roy Om, Silveri Priyanka, Patel Sanskruti
Department of Information Technology, Devang Patel Institute of Advance Technology and Research, Charotar University of Science and Technology, CHARUSAT Campus, Anand 388421, Gujarat, India.
Department of Information Technology, Smt. Kundanben Dinsha Patel Department of Information Technology, Charotar University of Science and Technology, CHARUSAT Campus, Anand 388421, Gujarat, India.
MethodsX. 2025 Jul 17;15:103519. doi: 10.1016/j.mex.2025.103519. eCollection 2025 Dec.
Oral squamous cell carcinoma (OSCC) remains a major global healthcare problem because of poor survival outcomes and frequent recurrence. GLOBOCAN estimates attribute 389,846 new cases and 188,438 deaths worldwide to OSCC in 2022, and the 5-year survival rate remains poor at about 50%. Our method combines residual connections with Squeeze-and-Excitation (SE) blocks, hybrid attention mechanisms, enhanced activation functions, and optimization algorithms to improve gradient flow throughout feature extraction. Compared with established conventional CNN backbones (VGG16, ResNet50, DenseNet121, and others), the proposed ConvNeXt-SE-Attn model performed better on every reported measure of discrimination and calibration: precision 97.88% (vs. ≤94.2%), sensitivity 96.82% (vs. ≤92.5%), specificity 95.94% (vs. ≤93.1%), F1 score 97.31% (vs. ≤93.8%), AUC 0.9644 (vs. ≤0.945), and MCC 0.9397 (vs. ≤0.910). These results point to the architecture's increased feature-representation power and classification robustness. The proposed architecture uses a ConvNeXt backbone with SE blocks and hybrid attention to extract essential details within class boundaries that standard models usually miss. Gaussian-based GReLU activation is combined with Swish activation and DropPath regularization to produce smooth gradients, which yield features that generalize across imbalanced datasets. Grad-CAM improves interpretability by highlighting the image regions that drive predictions, which supports clinical decision-making. The model is shown to be an effective method for detecting subtle variations in oral cells, supporting precise, non-invasive treatment approaches for OSCC.
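For readers less familiar with the reported measures, the following is a minimal sketch of how precision, sensitivity, specificity, F1 score, and MCC are conventionally derived from a binary confusion matrix. It is not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study. AUC is omitted because it requires per-sample scores, not counts.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives.
    """
    precision = _safe_div(tp, tp + fp)
    sensitivity = _safe_div(tp, tp + fn)   # recall, true-positive rate
    specificity = _safe_div(tn, tn + fp)   # true-negative rate
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient: uses all four cells, so it stays
    # informative when the two classes are imbalanced.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not from the paper).
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

MCC is reported alongside F1 in work on imbalanced data because F1 ignores true negatives, whereas MCC only approaches 1 when both classes are predicted well.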