Peng Haixin, An Xinjun, Chen Xue, Chen Zhenxiang
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China.
Phys Eng Sci Med. 2025 Jul 24. doi: 10.1007/s13246-025-01583-5.
Medical image segmentation is a complex and challenging task that aims to accurately delineate various structures or abnormal regions in medical images. However, obtaining accurate segmentation results is difficult because of the great uncertainty in the shape, location, and scale of the target region. To address these challenges, we propose a higher-order spatial interaction framework with dual cross global efficient attention (DGEAHorNet), which employs a neural network architecture based on recursive gated convolution to adequately extract multi-scale contextual information from images. Specifically, a Dual Cross-Attention (DCA) module is added to the skip connections to effectively blend multi-stage encoder features and narrow the semantic gap. In the bottleneck stage, a Global Channel Spatial Attention Module (GCSAM) is used to extract global image information. To obtain a better feature representation, we feed the output of the GCSAM into a multi-branch dense layer (SENetV2) for excitation. Furthermore, we adopt Depthwise Over-parameterized Convolutional layers (DO-Conv) to replace the common convolutional layers in the input and output parts of our network, and add Efficient Attention (EA) to reduce computational complexity and enhance the model's performance. To evaluate the effectiveness of the proposed DGEAHorNet, we conduct comprehensive experiments on four publicly available datasets, achieving Dice similarity coefficients of 0.9320, 0.9337, 0.9312, and 0.7799 on ISIC2018, ISIC2017, CVC-ClinicDB, and HRF, respectively. Our results show that DGEAHorNet outperforms state-of-the-art methods. The code is publicly available at https://github.com/penghaixin/mymodel .
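The Efficient Attention (EA) mentioned in the abstract commonly refers to the linear-attention formulation of Shen et al., in which softmax is applied separately to the queries (over the feature axis) and the keys (over the position axis), so the global context matrix K^T V can be computed first in O(n·d_k·d_v) rather than forming the O(n²) attention map. The following NumPy sketch illustrates that formulation only; it is not the authors' implementation, and the variable names are illustrative.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def efficient_attention(Q, K, V):
    """Linear-complexity attention (Shen et al. style sketch).

    Q, K: (n, d_k) query/key matrices; V: (n, d_v) value matrix.
    Returns an (n, d_v) output without ever forming the n x n map.
    """
    q = softmax(Q, axis=1)   # normalize each position over features
    k = softmax(K, axis=0)   # normalize each feature over positions
    context = k.T @ V        # (d_k, d_v) global context summary
    return q @ context       # (n, d_v) attended output

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 4))
K = rng.normal(size=(5, 4))
V = np.ones((5, 3))
out = efficient_attention(Q, K, V)  # rows of q and columns of k each sum to 1
```

Because both softmax factors are normalized, feeding constant values V yields the same constant back, which is a quick sanity check that the two-stage normalization behaves like a proper attention average.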