

TAC-UNet: transformer-assisted convolutional neural network for medical image segmentation.

Author information

He Jingliu, Ma Yuqi, Yang Mingyue, Yang Wensong, Wu Chunming, Chen Shanxiong

Affiliations

The College of Computer and Information Science, Southwest University, Chongqing, China.

The Department of NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, the First Affiliated Hospital of Chongqing Medical University, Chongqing, China.

Publication information

Quant Imaging Med Surg. 2024 Dec 5;14(12):8824-8839. doi: 10.21037/qims-24-1229. Epub 2024 Nov 5.

Abstract

BACKGROUND

Medical image segmentation is crucial for improving healthcare outcomes. Convolutional neural networks (CNNs) have been widely applied in medical image analysis; however, their inherent inductive biases limit their ability to capture global contextual information. Vision transformer (ViT) architectures address this limitation by leveraging attention mechanisms to model global relationships; however, they typically require large-scale datasets for effective training, which is challenging in the field of medical imaging due to limited data availability. This study aimed to integrate the advantages of CNN and ViT architectures to improve segmentation performance on small-scale medical image datasets.

METHODS

In this study, we established a U-shaped network architecture based on a Transformer-assisted convolutional neural network (TAC-UNet). The TAC-UNet is primarily composed of a hybrid structure integrating CNN and Transformer components. Specifically, the hybrid architecture follows a dual-path design in which the Transformer branch continuously conveys global contextual information to the CNN backbone. This allows the CNN backbone to enhance its global perception while building on the local features it extracts, thereby improving its ability to comprehend complex image structures. A channel cross-attention (CCA) module is also incorporated as a bridge between the encoder and decoder to better reconcile the semantic discrepancies between them.
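The dual-path idea described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`local_features`, `global_context`, `dual_path_stage`), the single-head attention without learned weights, the 3x3 mean filter standing in for a convolution block, and fusion by simple addition are all assumptions chosen for brevity. The released repository is the reference for the actual design.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_features(feat):
    """Hypothetical stand-in for a CNN block: 3x3 mean filter per channel.

    feat has shape (C, H, W); the output has the same shape.
    """
    feat = np.asarray(feat, dtype=float)
    _, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(feat)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / 9.0


def global_context(feat):
    """Hypothetical stand-in for the Transformer branch.

    Single-head self-attention over all H*W positions, with no learned
    projections, so every position can attend to every other position.
    """
    feat = np.asarray(feat, dtype=float)
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T             # (N, C), one token per pixel
    scores = tokens @ tokens.T / np.sqrt(c)       # (N, N) pairwise similarity
    attended = softmax(scores, axis=-1) @ tokens  # (N, C) context per token
    return attended.T.reshape(c, h, w)


def dual_path_stage(cnn_feat, trans_feat):
    """One stage: the CNN path receives global context from the other path.

    Fusion by addition is an assumption made for this sketch.
    """
    context = global_context(trans_feat)
    fused = local_features(cnn_feat) + context
    return fused, context


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
fused, context = dual_path_stage(x, x)
print(fused.shape, context.shape)
```

In this sketch the convolution-like path only sees a 3x3 neighbourhood, whereas the attention path mixes information from the whole feature map. Adding the two gives each position both a local and a global view, which is the intuition the abstract describes.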

RESULTS

Experiments were conducted on three public datasets. Specifically, our model was trained on 30 images from the Multi-organ Nucleus Segmentation (MoNuSeg) training dataset, 85 images from the Gland Segmentation (GlaS) training dataset, and 551 images from the Computer Vision Center Colorectal Cancer-Clinic Database (CVC-ClinicDB) dataset, and was evaluated on the corresponding test sets. TAC-UNet achieved the highest Dice scores among all compared models: 80.36% on MoNuSeg, 90.70% on GlaS, and 91.81% on CVC-ClinicDB. Compared to other CNN-based, Transformer-based, and hybrid methods, the TAC-UNet demonstrated significantly superior segmentation performance.
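For readers unfamiliar with the metric, the Dice score for binary masks is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions, and the authors' evaluation code may differ in thresholding or averaging.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(f"{dice_score(pred, target):.4f}")  # 2*2 / (3+3) = 0.6667
```

Reported figures such as 80.36% correspond to this quantity expressed as a percentage, typically averaged over the test images.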

CONCLUSIONS

Our TAC-UNet model achieved strong segmentation performance on small-scale medical image datasets, and the experimental results support the effectiveness of the method. Our model's code is available at: https://github.com/hejlhello/TAC-UNet.


