Park So-Yun, Ayana Gelan, Wako Beshatu Debela, Jeong Kwangcheol Casey, Yoon Soon-Do, Choe Se-Woon
Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Republic of Korea.
Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Republic of Korea.
Diagnostics (Basel). 2025 Jan 23;15(3):260. doi: 10.3390/diagnostics15030260.
Squamous cell carcinoma (SCC), a prevalent form of skin cancer, presents diagnostic challenges, particularly in resource-limited settings with low-quality imaging infrastructure. The accurate classification of SCC margins is essential to guide effective surgical interventions and reduce recurrence rates. This study proposes a vision transformer (ViT)-based model to improve SCC margin classification by addressing the limitations of convolutional neural networks (CNNs) in analyzing low-quality histopathological images. It introduces a transfer learning approach using a ViT architecture customized with additional flattening, batch normalization, and dense layers to enhance its capability for SCC margin classification. Performance was evaluated using machine learning metrics averaged over five-fold cross-validation, with comparisons against leading CNN models, and ablation studies explored the effect of architectural configuration on model performance. The ViT-based model achieved superior SCC margin classification with 0.928 ± 0.027 accuracy and 0.927 ± 0.028 AUC, surpassing the best-performing CNN model, InceptionV3 (accuracy: 0.86 ± 0.049; AUC: 0.837 ± 0.029), demonstrating the robustness of ViTs over CNNs on low-quality histopathological images. The ablation studies reinforced the importance of tailored architectural configurations for enhancing diagnostic performance. This study underscores the transformative potential of ViTs in histopathological analysis, especially in resource-limited settings. By enhancing diagnostic accuracy and reducing dependence on high-quality imaging and specialized expertise, it presents a scalable solution for global cancer diagnostics. Future research should prioritize optimizing ViTs for such environments and broadening their clinical applications.
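The customized classification head described in the abstract (flattening the ViT token embeddings, applying batch normalization, then a dense layer) can be sketched roughly as below. This is a minimal numpy-only illustration, not the authors' implementation: the token/embedding shapes (197 × 768), the single dense layer, the binary softmax output, and the omission of learned batch-norm scale/shift parameters are all assumptions made for brevity.

```python
import numpy as np

def classification_head(features, W, b, eps=1e-5):
    """Hypothetical sketch of the head on top of a pre-trained ViT backbone:
    flatten -> batch normalization -> dense layer with softmax.
    `features`: (batch, tokens, dim) token embeddings from the backbone."""
    batch = features.shape[0]
    x = features.reshape(batch, -1)            # flatten all token embeddings per image
    mu, var = x.mean(axis=0), x.var(axis=0)    # batch statistics
    x = (x - mu) / np.sqrt(var + eps)          # batch normalization (no learned scale/shift here)
    logits = x @ W + b                         # dense layer mapping to the two margin classes
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)    # softmax probabilities

# Illustrative dimensions: 4 images, 197 tokens, 768-dim embeddings, 2 classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 197, 768))
W = rng.normal(scale=0.01, size=(197 * 768, 2))
b = np.zeros(2)
probs = classification_head(feats, W, b)       # shape (4, 2), rows sum to 1
```

In practice such a head would be trained end-to-end (or with the backbone frozen, as in transfer learning) using a framework such as PyTorch or Keras; the numpy version above only shows the forward pass.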