School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China.
Comput Methods Programs Biomed. 2024 Jul;252:108235. doi: 10.1016/j.cmpb.2024.108235. Epub 2024 May 18.
Computer-based biomedical image segmentation plays a crucial role in the planning of assisted diagnosis and therapy. However, due to the variable size and irregular shape of segmentation targets, constructing an effective medical image segmentation architecture remains a challenge. Recently, hybrid architectures based on convolutional neural networks (CNNs) and Transformers have been proposed. However, most current backbones directly replace one or all convolutional layers with Transformer blocks without accounting for the semantic gap between features. Thus, how to sufficiently and effectively eliminate this semantic gap while combining global and local information is a critical challenge.
To address this challenge, we propose a novel structure, called BiU-Net, which integrates CNNs and Transformers through a two-stage fusion strategy. In the first stage, called the Single-Scale Fusion (SSF) stage, the CNN and Transformer encoding layers are coupled, with both operating on feature maps of the same size. The SSF stage aims to reconstruct local features via the CNN branch and long-range information via the Transformer branch in each encoding block. In the second stage, Multi-Scale Fusion (MSF), BiU-Net fuses multi-scale features from different encoding layers to eliminate the semantic gap between deep and shallow layers. Furthermore, a Context-Aware Block (CAB) is embedded in the bottleneck to reinforce multi-scale features in the decoder.
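The SSF idea of pairing a convolutional (local) branch with a self-attention (global) branch at matching feature-map sizes can be illustrated with a minimal NumPy sketch. This is a toy conceptual example, not the authors' implementation: scalar per-pixel features, an averaging kernel, and additive fusion are all illustrative assumptions, and the function names (`local_branch`, `global_branch`, `ssf_block`) are hypothetical.

```python
import numpy as np

def local_branch(x, kernel):
    """CNN-like path: a 3x3 convolution (zero padding) capturing local context."""
    H, W = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * kernel)
    return out

def global_branch(x):
    """Transformer-like path: single-head self-attention over flattened pixels,
    so every pixel can attend to every other (long-range context)."""
    t = x.reshape(-1, 1)                  # tokens: one scalar feature per pixel
    scores = t @ t.T                      # pairwise similarity between pixels
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # row-wise softmax
    return (attn @ t).reshape(x.shape)

def ssf_block(x):
    """Single-Scale Fusion sketch: fuse local and global features of equal size."""
    kernel = np.full((3, 3), 1.0 / 9.0)   # illustrative averaging kernel
    return local_branch(x, kernel) + global_branch(x)

feat = np.arange(16, dtype=float).reshape(4, 4) / 16.0
fused = ssf_block(feat)
print(fused.shape)  # (4, 4): fused map keeps the input feature-map size
```

The key point mirrored from the paper is that both branches consume and produce feature maps of identical spatial size, so their outputs can be fused directly within each encoding block.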
Experiments were conducted on four public datasets. On the BUSI dataset, BiU-Net achieved 85.50% Dice coefficient (Dice), 76.73% intersection over union (IoU), and 97.23% accuracy (ACC). Compared to the state-of-the-art method, BiU-Net improves the Dice score by 1.17%. On the MoNuSeg dataset, the proposed method attained the highest scores, reaching 80.27% Dice and 67.22% IoU. BiU-Net achieves Dice scores of 95.33% and 81.22% on the PH2 and DRIVE datasets, respectively.
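The three reported metrics have standard definitions for binary segmentation masks; a small NumPy sketch (names and example masks are our own, for illustration only) shows how they are computed:

```python
import numpy as np

def seg_metrics(pred, gt):
    """Dice, IoU, and pixel accuracy for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()   # |pred ∩ gt|
    union = np.logical_or(pred, gt).sum()    # |pred ∪ gt|
    dice = 2.0 * inter / (pred.sum() + gt.sum())  # 2|A∩B| / (|A|+|B|)
    iou = inter / union                            # |A∩B| / |A∪B|
    acc = (pred == gt).mean()                      # correctly labeled pixels
    return dice, iou, acc

# Tiny illustrative masks (2x3 pixels)
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
dice, iou, acc = seg_metrics(pred, gt)
print(round(dice, 4), round(iou, 4), round(acc, 4))  # 0.6667 0.5 0.6667
```

Note that Dice is always at least as large as IoU for the same prediction, which is consistent with the Dice/IoU pairs reported above.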
The experimental results show that BiU-Net outperforms existing state-of-the-art methods on four publicly available biomedical datasets. Owing to its powerful multi-scale feature extraction ability, the proposed BiU-Net is a versatile segmentation framework for various types of medical images. The source code is released at https://github.com/ZYLandy/BiU-Net.