NA-segformer: A multi-level transformer model based on neighborhood attention for colonoscopic polyp segmentation.
Affiliations
Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Xiangnan University, Chenzhou, 423300, China.
School of Computer and Artificial Intelligence, Xiangnan University, Chenzhou, 423300, China.
Publication information
Sci Rep. 2024 Sep 28;14(1):22527. doi: 10.1038/s41598-024-74123-y.
In many countries, deaths related to colon cancer have been rising in recent years. Early detection of symptoms and identification of intestinal polyps are crucial for improving the cure rate of colon cancer patients. Automated computer-aided diagnosis (CAD) has emerged as a solution to the low efficiency of traditional methods that rely on manual diagnosis by physicians. Deep learning is the latest direction in CAD development and has shown promise for colonoscopic polyp segmentation. In this paper, we present NA-SegFormer, a multi-level encoder-decoder architecture for polyp segmentation based on the Transformer. To improve how existing Transformer-based segmentation algorithms handle the edges of colon polyps, we propose a patch merging module that adds a neighborhood attention mechanism to overlap patch merging. Because colon polyps vary greatly in size and the datasets differ in sample size, we use a unified focal loss to address class imbalance in colon polyp data. To assess the effectiveness of the proposed method, we evaluated it on video capsule endoscopy and typical colonoscopy polyp datasets, as well as on a dataset containing surgical instruments. On the Kvasir-SEG, Kvasir-Instrument and KvasirCapsule-SEG datasets, the proposed model reached Dice scores of 94.30%, 94.59% and 82.73%, with accuracies of 98.26%, 99.02% and 81.84%, respectively. The proposed method achieved an inference speed of 125.01 frames per second (FPS). The results show that the proposed model segments polyps more accurately than several well-known and recent models. In addition, the method offers a favorable trade-off between inference speed and accuracy, which is of great significance for real-time colonoscopic polyp segmentation. The code is available at https://github.com/promisedong/NAFormer.
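The abstract reports both a Dice score and an accuracy for each dataset. As a rough illustration of how these two metrics differ on imbalanced masks, the sketch below computes them for a toy binary mask using the standard textbook definitions. The function names `dice_score` and `pixel_accuracy` and the toy data are assumptions made for illustration; this is not taken from the authors' code, and their evaluation may differ in details such as thresholding or averaging.

```python
def dice_score(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks (values 0 or 1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the reference label."""
    correct = sum(1 for p, t in zip(pred, target) if p == t)
    return correct / len(target)


# Toy 1x10 "image": the polyp covers 2 pixels, the prediction finds only 1.
target = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
pred = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

dice = dice_score(pred, target)
acc = pixel_accuracy(pred, target)
print(f"Dice = {dice:.4f}, accuracy = {acc:.4f}")
# prints "Dice = 0.6667, accuracy = 0.9000"
```

In this toy case half of the polyp is missed, yet accuracy stays at 90% because background pixels dominate. This is the kind of foreground/background imbalance that motivates overlap-based metrics such as Dice and imbalance-aware losses such as the unified focal loss mentioned in the abstract.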