Sasmal Pradipta, Kumar Panigrahi Susant, Panda Swarna Laxmi, Bhuyan M K
Department of Electrical Engineering, Indian Institute of Technology, Kharagpur, West Bengal, 721302, India.
Department of Electronics and Communication Engineering, National Institute of Technology, Rourkela, Odisha, 769008, India.
Med Biol Eng Comput. 2025 May 2. doi: 10.1007/s11517-025-03369-z.
Colorectal cancer (CRC) is one of the leading causes of death worldwide. This paper proposes an automated diagnostic technique to detect, localize, and classify polyps in colonoscopy video frames. The proposed model adopts the deep YOLOv4 model and incorporates spatial and contextual information, in the form of spatial attention and channel attention blocks respectively, for better localization of polyps. Finally, leveraging a fusion of deep and handcrafted features, the detected polyps are classified as adenoma or non-adenoma. Polyp shape and texture are essential features for discriminating polyp types. Therefore, the proposed work utilizes a pyramid histogram of oriented gradients (PHOG) and embedding features learned via a triplet Siamese architecture to extract these features. The PHOG extracts local shape information from each polyp class, whereas the Siamese network extracts intra-polyp discriminating features. Individual and cross-database evaluations on two databases demonstrate the robustness of our method in polyp localization. A competitive analysis against current state-of-the-art methods, based on significant clinical parameters, confirms that our method can be used for automated polyp localization in both real-time and offline colonoscopic video frames. Our method provides average precisions of 0.8971 and 0.9171 and F1 scores of 0.8869 and 0.8812 on the Kvasir-SEG and SUN databases, respectively. Similarly, the proposed classification framework for the detected polyps yields a classification accuracy of 96.66% on a publicly available UCI colonoscopy video dataset. Moreover, the classification framework provides an F1 score of 96.54%, validating the potential of the proposed framework for polyp localization and classification.
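The PHOG descriptor named in the abstract concatenates weighted orientation histograms computed over a spatial pyramid of grid cells. The following is a minimal NumPy sketch of that idea, not the authors' implementation; the parameter names (`bins`, `levels`) and the unsigned-orientation, L2-normalized design are illustrative assumptions.

```python
import numpy as np

def phog(image, bins=8, levels=3):
    """PHOG-style descriptor sketch: gradient-orientation histograms,
    weighted by gradient magnitude, over a 1x1, 2x2, 4x4, ... pyramid,
    concatenated and L2-normalized. Illustrative, not the paper's code."""
    img = image.astype(np.float64)
    gy, gx = np.gradient(img)               # row- and column-wise gradients
    mag = np.hypot(gx, gy)                  # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)

    feats = []
    h, w = img.shape
    for level in range(levels):
        cells = 2 ** level                  # pyramid level l has 2^l x 2^l cells
        for i in range(cells):
            for j in range(cells):
                ys = slice(i * h // cells, (i + 1) * h // cells)
                xs = slice(j * w // cells, (j + 1) * w // cells)
                hist, _ = np.histogram(ang[ys, xs], bins=bins,
                                       range=(0.0, np.pi),
                                       weights=mag[ys, xs])
                feats.append(hist)
    v = np.concatenate(feats)               # length = bins * (1 + 4 + 16 + ...)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

With `bins=8` and `levels=3` the descriptor has length 8 × (1 + 4 + 16) = 168; in the paper such shape features are fused with Siamese-network embeddings before classification.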