

Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer.

Author Information

Song Bofan, Kc Dharma Raj, Yang Rubin Yuchan, Li Shaobai, Zhang Chicheng, Liang Rongguang

Affiliations

Wyant College of Optical Sciences, The University of Arizona, Tucson, AZ 85721, USA.

Computer Science Department, The University of Arizona, Tucson, AZ 85721, USA.

Publication Information

Cancers (Basel). 2024 Feb 29;16(5):987. doi: 10.3390/cancers16050987.

Abstract

Oral cancer, a pervasive and rapidly growing malignant disease, poses a significant global health concern. Early and accurate diagnosis is pivotal for improving patient outcomes. Automatic diagnosis methods based on artificial intelligence have shown promising results in the oral cancer field, but the accuracy still needs to be improved for realistic diagnostic scenarios. Vision Transformers (ViTs) have recently outperformed convolutional neural network (CNN) models in many computer vision benchmark tasks. This study explores the effectiveness of the Vision Transformer and the Swin Transformer, two cutting-edge variants of the transformer architecture, for the mobile-based oral cancer image classification application. The pre-trained Swin Transformer model achieved 88.7% accuracy in the binary classification task, outperforming the ViT model by 2.3%, while the conventional convolutional network models VGG19 and ResNet50 achieved 85.2% and 84.5% accuracy, respectively. Our experiments demonstrate that these transformer-based architectures outperform traditional convolutional neural networks in oral cancer image classification, and underscore the potential of the ViT and the Swin Transformer in advancing the state of the art in oral cancer image analysis.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6784/10931180/d1932e867de1/cancers-16-00987-g001.jpg
