Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, Heilongjiang, China.
Sergeant Schools of Army Academy of Armored Forces, Changchun, Jilin, China.
PLoS One. 2024 Apr 4;19(4):e0301019. doi: 10.1371/journal.pone.0301019. eCollection 2024.
Automatic and accurate segmentation of medical images plays an essential role in disease diagnosis and treatment planning. Convolutional neural networks have achieved remarkable results in medical image segmentation over the past decade, and deep learning models based on the Transformer architecture have also been highly successful in this domain. However, because boundaries in medical images are ambiguous and tissue structures are highly complex, effective structure extraction and accurate segmentation remain open problems. In this paper, we propose a novel dual-encoder network, DECTNet, to alleviate this problem. DECTNet comprises four components: a convolution-based encoder, a Transformer-based encoder, a feature fusion decoder, and a deep supervision module. The convolutional encoder extracts fine spatial contextual details from images, while the Transformer encoder uses a hierarchical Swin Transformer architecture to model global contextual information. The feature fusion decoder integrates the multi-scale representations from the two encoders and uses a channel attention mechanism to select the features most relevant to the segmentation task. In addition, a deep supervision module is used to accelerate convergence. Extensive experiments demonstrate that, compared with seven other models, the proposed method achieves state-of-the-art results on four segmentation tasks: skin lesion segmentation, polyp segmentation, COVID-19 lesion segmentation, and MRI cardiac segmentation.
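To make the idea of channel-attention feature fusion concrete, the following is a minimal sketch in plain Python. It is not the DECTNet decoder: the abstract does not specify the decoder's internals, so the sketch assumes a squeeze-and-excitation style gate applied to the concatenated channels of two encoders. The function names, the tiny 2x2 feature maps, and the weights `w1`, `b1`, `w2`, `b2` are all hypothetical and chosen only for illustration.

```python
import math


def global_average_pool(feature_map):
    """Squeeze step: reduce each channel (a 2D list) to its mean value."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def dense(vector, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_fuse(cnn_features, transformer_features, w1, b1, w2, b2):
    """Concatenate two encoders' channels and rescale each channel by a gate.

    Assumption: a squeeze-and-excitation style gate stands in for the
    channel attention mentioned in the abstract; it is illustrative only.
    """
    fused = cnn_features + transformer_features  # channel-wise concatenation
    squeezed = global_average_pool(fused)
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]  # ReLU bottleneck
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]  # one gate per channel
    reweighted = [
        [[v * gate for v in row] for row in channel]
        for channel, gate in zip(fused, gates)
    ]
    return reweighted, gates


# Hypothetical inputs: one 2x2 channel from each encoder.
cnn_features = [[[1.0, 2.0], [3.0, 4.0]]]
transformer_features = [[[0.0, 0.0], [0.0, 0.0]]]

# Hypothetical weights: 2 channels -> 1 hidden unit -> 2 gates.
w1, b1 = [[1.0, 1.0]], [0.0]
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]

fused, gates = channel_attention_fuse(
    cnn_features, transformer_features, w1, b1, w2, b2
)
print(gates)
```

With these illustrative weights, the channel carrying signal receives the larger gate and the all-zero channel the smaller one. In a trained network the weights would be learned and the gates would reflect which encoder's channels are more useful at each decoder stage.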