MultiIB-TransUNet: Transformer with multiple information bottleneck blocks for CT and ultrasound image segmentation.

Author information

Li Guangju, Jin Dehu, Yu Qi, Zheng Yuanjie, Qi Meng

Affiliation

School of Information Science and Engineering, Shandong Normal University, Jinan, China.

Publication information

Med Phys. 2024 Feb;51(2):1178-1189. doi: 10.1002/mp.16662. Epub 2023 Aug 1.

DOI:10.1002/mp.16662

PMID:37528654

Abstract

BACKGROUND

Accurate medical image segmentation is crucial for disease diagnosis and surgical planning. Transformer networks offer a promising alternative for medical image segmentation because they can learn global features through self-attention. To further improve performance, many researchers have added more Transformer layers to their models. However, this approach often increases the number of model parameters significantly, which raises model complexity. Moreover, medical image segmentation datasets usually contain few samples, which puts the model at risk of overfitting.

PURPOSE

This paper aims to design a medical image segmentation model that has fewer parameters and can effectively alleviate overfitting.

METHODS

We design a MultiIB-Transformer structure consisting of a single Transformer layer and multiple information bottleneck (IB) blocks. The Transformer layer captures long-distance spatial relationships to extract global feature information. The IB blocks compress noise and improve model robustness. The advantage of this structure is that it needs only one Transformer layer to achieve state-of-the-art (SOTA) performance, which significantly reduces the number of model parameters. In addition, we design a new skip connection structure. Using only two 1×1 convolutions, it lets the high-resolution feature map carry both semantic and spatial information, thereby alleviating the semantic gap.
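To give a rough sense of why the number of Transformer layers dominates the parameter budget while a 1×1 convolution stays cheap, the following is a minimal sketch that counts learnable parameters. It is not the authors' implementation. The layer layout (a standard encoder layer with biased linear projections and two LayerNorms), the dimensions 768 and 3072, the 12-layer comparison, and the channel sizes 256 and 64 are all illustrative assumptions that do not come from the abstract.

```python
def transformer_layer_params(d_model: int, d_ff: int) -> int:
    """Count learnable parameters in one standard Transformer encoder layer.

    Assumed layout (not taken from the paper): multi-head self-attention
    with biased Q, K, V and output projections, a two-layer MLP, and two
    LayerNorms. The count does not depend on the number of heads.
    """
    attention = 4 * (d_model * d_model + d_model)  # Q, K, V, output projections
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    layer_norms = 2 * (2 * d_model)  # scale and shift for each LayerNorm
    return attention + mlp + layer_norms


def conv1x1_params(c_in: int, c_out: int, bias: bool = True) -> int:
    """Count learnable parameters in a 1x1 convolution.

    A 1x1 kernel is a per-pixel linear map across channels, so the weight
    has c_in * c_out entries regardless of the spatial resolution.
    """
    return c_in * c_out + (c_out if bias else 0)


# Illustrative (assumed) sizes: d_model=768, d_ff=3072.
one_layer = transformer_layer_params(768, 3072)
twelve_layers = 12 * one_layer
skip_conv = conv1x1_params(256, 64)

print(f"1 Transformer layer:   {one_layer:,} parameters")
print(f"12 Transformer layers: {twelve_layers:,} parameters")
print(f"one 1x1 conv (256->64): {skip_conv:,} parameters")
```

Under these assumed sizes, one encoder layer holds about 7.1 million parameters, so a 12-layer stack holds about 85 million, whereas a single 1×1 convolution costs only a few thousand.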

RESULTS

On the Breast UltraSound Images (BUSI) dataset, the proposed model achieves IoU and F1 scores of 67.75 and 87.78, respectively. On the Synapse multi-organ segmentation dataset, its parameter count (Param), Hausdorff Distance (HD), and Dice Similarity Coefficient (DSC) are 22.30, 20.04, and 81.83, respectively.
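For readers unfamiliar with these indicators, the following is a minimal sketch of how IoU, DSC, and the Hausdorff distance are commonly defined for binary masks and point sets. The function names and toy inputs are hypothetical. The abstract does not state which averaging scheme or which Hausdorff variant (for example, a percentile-based one) was used, so this sketch may differ from the evaluation protocol behind the reported numbers.

```python
import math


def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two flat binary masks of equal length.

    For a single binary mask, Dice coincides with the F1 score.
    Two empty masks are treated as a perfect match (an assumption).
    """
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0
    return tp / union, 2 * tp / (2 * tp + fp + fn)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))


# Toy example with made-up masks and boundary points.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 1, 1, 0]
iou, dice = iou_and_dice(pred_mask, true_mask)
hd = hausdorff_distance([(0, 0), (0, 1)], [(0, 0), (3, 4)])
print(f"IoU={iou:.3f}  Dice={dice:.3f}  HD={hd:.3f}")
```

Higher IoU and DSC indicate better overlap with the ground truth, while a lower HD indicates that the predicted boundary stays closer to the true one.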

CONCLUSIONS

Our proposed model (MultiIB-TransUNet) achieved superior results with fewer parameters compared to other models.
