Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University, Al Majmaah, 11952, Saudi Arabia.
Department of Artificial Intelligence and Data Science, KPR Institute of Engineering and Technology, Coimbatore, 641407, India.
Sci Rep. 2024 Mar 27;14(1):7318. doi: 10.1038/s41598-024-57993-0.
Polyp detection is a challenging task in the diagnosis of colorectal cancer (CRC), and it demands clinical expertise because polyps vary widely in appearance. Recent years have seen the development of automated polyp detection systems that assist experts in early diagnosis, considerably reducing both diagnosis time and diagnostic errors. In automated CRC diagnosis, polyp segmentation is an important step, typically carried out with deep learning segmentation models. Vision Transformers (ViTs) are gradually replacing these models because they can capture long-range dependencies among image patches. However, existing ViTs for polyp segmentation do not harness the inherent capabilities of self-attention and instead incorporate complex attention mechanisms. This paper presents Polyp-Vision Transformer (Polyp-ViT), a novel Transformer model based on the conventional Transformer architecture and enhanced with adaptive mechanisms for feature extraction and positional embedding. Polyp-ViT is tested on the Kvasir-seg and CVC-Clinic DB datasets, achieving segmentation accuracies of 0.9891 ± 0.01 and 0.9875 ± 0.71 respectively and outperforming state-of-the-art models. Polyp-ViT is a promising tool for polyp segmentation that can also be adapted to other medical image segmentation tasks because it generalizes well.
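The abstract reports segmentation accuracy but does not define the metric. As a minimal sketch, assuming it refers to pixel-wise accuracy on binary masks, the following shows how such a score could be computed, alongside the Dice overlap that is commonly reported for segmentation. The function names and the toy 4x4 masks are hypothetical and are not taken from the paper.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def dice_score(pred, truth):
    """Dice overlap between the predicted and true polyp (label 1) regions."""
    intersection = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += int(p == 1 and t == 1)
            pred_area += int(p == 1)
            truth_area += int(t == 1)
    if pred_area + truth_area == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2 * intersection / (pred_area + truth_area)


# Toy masks (hypothetical data): 1 = polyp pixel, 0 = background.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],  # one polyp pixel missed
    [0, 0, 0, 0],
]

print(pixel_accuracy(pred, truth))
print(dice_score(pred, truth))
```

In this toy case one missed pixel out of 16 still gives a high pixel accuracy, while the Dice score drops more noticeably. This is why pixel accuracy alone can look optimistic when polyps occupy a small part of the image.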