School of Information Engineering, Qujing Normal University, Qujing, China.
Key Laboratory of Intelligent Sensor and System Design, College of Information Engineering, Qujing Normal University, Qujing, China.
Skin Res Technol. 2024 Sep;30(9):e70040. doi: 10.1111/srt.70040.
Skin cancer is among the most common diseases in humans. Early detection and treatment are essential to limit malignant progression. Deep learning techniques serve as supplementary tools that help clinical experts detect and localize skin lesions. Vision transformers (ViTs) applied to multiclass image segmentation and classification provide fairly accurate detection and are gaining popularity because of their reliable multiclass prediction capabilities.
In this research, we propose a new ViT architecture based on Gradient-Weighted Class Activation Mapping (GradCAM), named ViT-GradCAM, for detecting and classifying skin lesions according to the spreading ratio over the lesion's surface area. The proposed system is trained and validated on the HAM10000 dataset, which covers seven classes of skin lesions. The database comprises 10,015 dermatoscopic images of varied sizes. Data preprocessing and data augmentation techniques are applied to overcome class imbalance and improve the model's performance.
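The abstract does not say how augmentation was used to balance the classes. As a minimal sketch of one common approach, the snippet below computes how many augmented images each class would need to match the largest class. The function name `augmentation_plan` is hypothetical, and the per-class counts are the commonly reported HAM10000 figures rather than values taken from this abstract, so they should be checked against the dataset metadata.

```python
def augmentation_plan(class_counts, target=None):
    """Return the number of extra (augmented) images needed per class.

    Assumption: classes are balanced by oversampling up to `target`,
    which defaults to the size of the largest class.
    """
    if target is None:
        target = max(class_counts.values())
    return {label: max(0, target - n) for label, n in class_counts.items()}


# Commonly reported HAM10000 class counts (assumption, not from the abstract).
ham10000_counts = {
    "nv": 6705,     # melanocytic nevi
    "mel": 1113,    # melanoma
    "bkl": 1099,    # benign keratosis-like lesions
    "bcc": 514,     # basal cell carcinoma
    "akiec": 327,   # actinic keratoses / intraepithelial carcinoma
    "vasc": 142,    # vascular lesions
    "df": 115,      # dermatofibroma
}

plan = augmentation_plan(ham10000_counts)
for label, extra in plan.items():
    print(f"{label}: {ham10000_counts[label]} originals, {extra} augmented images needed")
```

Under these assumed counts the majority class accounts for roughly two thirds of the images, which is why some form of rebalancing is usually applied before training.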
The proposed algorithm is based on ViT models and classifies the dermatoscopic images into seven classes with an accuracy of 97.28%, a precision of 98.51%, a recall of 95.2%, and an F1 score of 94.6%. The proposed ViT-GradCAM achieves more accurate detection and classification than other state-of-the-art deep learning-based skin lesion detection models. The ViT-GradCAM architecture is visualized extensively to highlight the pixels in the essential regions associated with skin-specific pathologies.
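The abstract does not state how the multiclass metrics were averaged. The following sketch, assuming macro averaging, shows how accuracy, precision, recall, and F1 can be derived from a confusion matrix. The helper names and the 3-class `toy_cm` matrix are hypothetical and are not the paper's results.

```python
def per_class_metrics(cm):
    """Compute (precision, recall, f1) for each class.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    k = len(cm)
    metrics = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp   # predicted c, true class differs
        fn = sum(cm[c]) - tp                        # true c, predicted as another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


def macro_average(metrics):
    """Unweighted mean of each metric over all classes."""
    k = len(metrics)
    return tuple(sum(m[i] for m in metrics) / k for i in range(3))


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Hypothetical 3-class confusion matrix, for illustration only.
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]

metrics = per_class_metrics(toy_cm)
macro_p, macro_r, macro_f1 = macro_average(metrics)
print(f"accuracy={accuracy(toy_cm):.4f} precision={macro_p:.4f} "
      f"recall={macro_r:.4f} f1={macro_f1:.4f}")
```

A macro F1 computed this way is the mean of the per-class F1 scores, so it need not lie between the macro precision and the macro recall.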
This research proposes an alternative solution to the challenges of detecting and classifying skin lesions: combining ViTs with GradCAM, which plays a significant role in accurate detection and classification, rather than relying solely on deep learning models.
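For readers unfamiliar with GradCAM, the sketch below illustrates the standard combination step: each feature map is weighted by the spatial mean of its gradients, the weighted maps are summed, and a ReLU keeps only positive evidence. It assumes the activations and gradients have already been extracted and, for a ViT, that the patch tokens have been reshaped into a 2D grid. That reshaping is a common adaptation and is not confirmed by the abstract. The function name `grad_cam` and the toy inputs are hypothetical.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps into one normalized heatmap.

    activations, gradients: lists of K maps, each an H x W nested list.
    gradients[k] holds the derivative of the class score with respect
    to activations[k].
    """
    # Channel weights: spatial mean of each gradient map.
    weights = []
    for grad_map in gradients:
        values = [g for row in grad_map for g in row]
        weights.append(sum(values) / len(values))

    height, width = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for w, act in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * act[i][j]

    # ReLU: keep only regions that raise the class score.
    cam = [[max(0.0, v) for v in row] for row in cam]

    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example: two 2x2 feature maps with opposite gradient signs.
toy_activations = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
toy_gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
heatmap = grad_cam(toy_activations, toy_gradients)
```

In the toy example only the top-left cell stays active, because the second feature map carries a negative weight and is removed by the ReLU.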