Islam Al Mohimanul, Bhuiyan Sadia Shakiba, Mashira Mysun, Ahmed Md Rayhan, Islam Salekul, Shatabda Swakkhar
Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh.
Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh; Department of Computer Science, The University of British Columbia, Kelowna, Canada.
Comput Biol Med. 2025 Aug 27;197(Pt A):110986. doi: 10.1016/j.compbiomed.2025.110986.
Segmenting polyps in colonoscopy images is essential for the early identification and diagnosis of colorectal cancer, a leading cause of cancer deaths worldwide. Prior deep learning models, such as attention-based variants, UNet variants, and Transformer-derived networks, have had notable success in capturing intricate features and complex polyp shapes. However, they frequently struggle to pinpoint small details and to enhance feature representation at both local and global scales. In this study, we introduce DeepLabV3++, an enhanced version of the DeepLabV3+ architecture designed to improve the precision and robustness of polyp segmentation in colonoscopy images. We use EfficientNetV2S in the encoder module for refined feature extraction with fewer trainable parameters. Additionally, we integrate Multi-Scale Pyramid Pooling (MSPP) and Parallel Attention Aggregation Block (PAAB) modules, along with a redesigned decoder, into DeepLabV3++. The proposed model incorporates diverse separable convolutional layers and attention mechanisms within the MSPP block, enhancing its capacity to capture multi-scale and directional features. The redesigned decoder further transforms the features extracted by the encoder into a more meaningful segmentation map. Our model was evaluated on three public datasets (CVC-ColonDB, CVC-ClinicDB, and Kvasir-SEG), achieving Dice coefficient scores of 96.20%, 96.54%, and 96.08%, respectively. Experimental analysis shows that DeepLabV3++ outperforms several state-of-the-art models in polyp segmentation. Furthermore, compared to the baseline DeepLabV3+ model, DeepLabV3++, with its MSPP module and redesigned decoder, significantly reduces segmentation errors (e.g., false positives and false negatives) across small, medium, and large polyps. This improvement in polyp delineation is crucial for accurate clinical decision-making in colonoscopy.
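The Dice coefficient reported above is the standard overlap metric for segmentation, Dice = 2|A∩B| / (|A| + |B|). A minimal NumPy sketch of this metric for binary masks (not the authors' evaluation code; the function name and epsilon smoothing are illustrative assumptions):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|) for binary segmentation masks.

    eps is a small smoothing term (an assumed convention) that avoids
    division by zero when both masks are empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy example: two 4x4 masks, each with 3 foreground pixels, 2 shared.
pred = np.array([[1, 1, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[1, 1, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])
score = dice_coefficient(pred, target)  # 2*2 / (3+3) = 2/3
```

A Dice score of 96% on held-out images thus corresponds to predicted masks whose overlap with the ground-truth masks is nearly complete relative to their combined size.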
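The abstract attributes part of the parameter reduction to separable convolutional layers in the MSPP block. A quick arithmetic sketch of why depthwise-separable convolutions cut parameters (standard textbook counts, ignoring biases; the channel sizes below are illustrative, not taken from the paper):

```python
def conv_params(k, c_in, c_out):
    # Standard 2D convolution: a k x k kernel for every (in, out) channel pair.
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    # Depthwise step: one k x k kernel per input channel,
    # followed by a 1x1 pointwise convolution mixing channels.
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3x3 kernel, 256 input and 256 output channels.
std = conv_params(3, 256, 256)           # 589,824 parameters
sep = separable_conv_params(3, 256, 256)  # 67,840 parameters
reduction = std / sep                     # roughly 8.7x fewer parameters
```

This is the standard motivation for separable convolutions in efficiency-oriented encoders such as the EfficientNet family mentioned above.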