HmsU-Net: A hybrid multi-scale U-net based on a CNN and transformer for medical image segmentation.

Author information

Medical College, Guizhou University, Guizhou 550000, China; Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China.

Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China.

Publication information

Comput Biol Med. 2024 Mar;170:108013. doi: 10.1016/j.compbiomed.2024.108013. Epub 2024 Jan 22.

DOI:10.1016/j.compbiomed.2024.108013

PMID:38271837

Abstract

Accurate medical image segmentation is of great significance for subsequent diagnosis and analysis. Acquiring multi-scale information plays an important role in segmenting regions of interest of different sizes. With the emergence of Transformers, numerous networks have adopted hybrid structures that combine Transformers and CNNs to learn multi-scale information. However, most research has focused on the design and composition of the CNN and Transformer structures, neglecting the inconsistencies in feature learning between the two. As a result, the performance of hybrid networks has not been fully realized. In this work, we propose a novel hybrid multi-scale segmentation network named HmsU-Net, which effectively fuses multi-scale features. Specifically, HmsU-Net employs a parallel design incorporating both CNN and Transformer architectures. To address the inconsistency in feature learning between the CNN and the Transformer within the same stage, we propose a multi-scale feature fusion module. For feature fusion across different stages, we introduce a cross-attention module. Comprehensive experiments on various datasets demonstrate that our approach surpasses current state-of-the-art methods.
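
To make the idea of fusing features across stages more concrete, the following is a minimal sketch of generic scaled dot-product cross-attention between two feature maps of different resolution. It is not the HmsU-Net cross-attention module, whose details are not given in the abstract. The function names (`to_tokens`, `cross_attention`), the choice of the shallow stage as the query source, the random projection matrices, and all shapes are assumptions made purely for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def to_tokens(feature_map):
    """Flatten a (C, H, W) feature map into (H*W, C) tokens."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Generic cross-attention: queries come from one stage,
    keys and values come from another stage."""
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights


# Hypothetical features from two encoder stages (assumed shapes).
rng = np.random.default_rng(0)
channels = 8
shallow = rng.standard_normal((channels, 16, 16))  # higher resolution
deep = rng.standard_normal((channels, 8, 8))       # lower resolution

# Random stand-ins for learned projection matrices (assumption).
w_q, w_k, w_v = (rng.standard_normal((channels, channels)) * 0.1
                 for _ in range(3))

fused, weights = cross_attention(
    to_tokens(shallow), to_tokens(deep), w_q, w_k, w_v
)
print(fused.shape, weights.shape)
```

In this sketch every position in the higher-resolution map gathers a weighted summary of the lower-resolution map, so the output keeps the shallow stage's token count while carrying context from the deep stage. A real network would learn the projections and typically add residual connections and normalization.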