Yang Guang, Luo Suhuai, Greer Peter
School of Information and Physical Sciences, The University of Newcastle, Callaghan, NSW 2308, Australia.
School of Information and Physical Sciences, College of Engineering, Science and Environment, The University of Newcastle, Callaghan, NSW 2308, Australia.
Sensors (Basel). 2025 Apr 15;25(8):2479. doi: 10.3390/s25082479.
Skin cancer is a significant global health concern, with melanoma being the most dangerous form, responsible for the majority of skin cancer-related deaths. Early detection of skin cancer is critical, as it can drastically improve survival rates. While deep learning models have achieved impressive results in skin cancer classification, there remain challenges in accurately distinguishing between benign and malignant lesions. In this study, we introduce a novel multi-scale attention-based performance booster inspired by the Vision Transformer (ViT) architecture, which enhances the accuracy of both ViT and convolutional neural network (CNN) models. By leveraging attention maps to identify discriminative regions within skin lesion images, our method improves the models' focus on diagnostically relevant areas. Additionally, we employ ensemble learning techniques to combine the outputs of several deep learning models using majority voting. Our skin cancer classifier, consisting of ViT and EfficientNet models, achieved a classification accuracy of 95.05% on the ISIC2018 dataset, outperforming individual models. The results demonstrate the effectiveness of integrating attention-based multi-scale learning and ensemble methods in skin cancer classification.
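The ensemble step described above combines the class predictions of several models by majority voting. A minimal sketch of that voting scheme is below; the model names, class labels, and predictions are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of a majority-voting ensemble, as described in the abstract.
# Model names and per-image predictions below are hypothetical examples.
from collections import Counter

def majority_vote(predictions):
    """Combine per-model class predictions by majority vote.

    predictions: a list with one entry per model; each entry is a list of
    predicted class labels, aligned by sample index across models.
    Returns the per-sample label that received the most votes.
    """
    n_samples = len(predictions[0])
    ensemble = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions)
        ensemble.append(votes.most_common(1)[0][0])
    return ensemble

# Three hypothetical classifiers (e.g. a ViT and two EfficientNet variants)
# each predicting an ISIC2018-style lesion class per image.
vit    = ["MEL", "NV", "BCC", "NV"]
eff_b4 = ["MEL", "NV", "NV",  "NV"]
eff_b5 = ["NV",  "NV", "BCC", "MEL"]

print(majority_vote([vit, eff_b4, eff_b5]))  # ['MEL', 'NV', 'BCC', 'NV']
```

With an odd number of models, ties across two classes are still possible when three or more classes split the votes; `Counter.most_common` then breaks the tie by insertion order, and a production system would typically fall back to averaged class probabilities instead.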