LKDA-Net: Hierarchical transformer with large kernel depthwise convolution attention for 3D medical image segmentation.

Author information

Li Ming, Ma Jingang, Zhao Jing

Affiliations

Graduate School, Shandong University of Traditional Chinese Medicine, Jinan, China.

School of Medical Information Engineering, Shandong University of Traditional Chinese Medicine, Jinan, China.

Publication information

PLoS One. 2025 Aug 8;20(8):e0329806. doi: 10.1371/journal.pone.0329806. eCollection 2025.

Abstract

Since Transformers have demonstrated excellent performance in the segmentation of two-dimensional medical images, recent works have also introduced them into 3D medical segmentation tasks. For example, hierarchical transformers like Swin UNETR have reintroduced several priors from convolutional networks, further enhancing volumetric segmentation on three-dimensional medical datasets. The effectiveness of these hybrid architectures is largely attributed to their large parameter counts and the large receptive fields of non-local self-attention. We believe that large-kernel volumetric depthwise convolutions can obtain large receptive fields with fewer parameters. In this paper, we propose LKDA-Net, a lightweight three-dimensional convolutional network for efficient and accurate volumetric segmentation. The network adopts a large-kernel depthwise convolution attention mechanism to simulate the self-attention of Transformers. First, inspired by the Swin Transformer module, we investigate large-kernel convolution attention at different kernel sizes to obtain larger global receptive fields, and replace the MLP in the Swin Transformer with an Inverted Bottleneck with Depthwise Convolutional Augmentation to reduce channel redundancy and enhance feature expression and segmentation performance. Second, we propose a skip-connection fusion module that achieves smooth feature fusion, enabling the decoder to effectively utilize the encoder's features. Finally, in experimental evaluations on three public datasets (Synapse, BTCV, and ACDC), LKDA-Net outperforms existing models of various architectures in segmentation performance while using fewer parameters. Code: https://github.com/zouyunkai/LKDA-Net.
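The abstract names three building blocks: a large-kernel depthwise convolution attention that stands in for self-attention, an inverted bottleneck with depthwise convolutional augmentation that replaces the Swin Transformer MLP, and a skip-connection fusion module. The PyTorch sketch below illustrates one plausible reading of each, assuming the common large-kernel attention decomposition (small depthwise conv, then dilated depthwise conv, then pointwise conv). All class names (LKDAttention3D, InvertedBottleneck3D, SkipFusion3D) and hyperparameters are illustrative assumptions, not taken from the authors' repository linked above.

```python
# Illustrative sketch only: class names, kernel sizes, and the smoke test are
# assumptions, not the authors' implementation (see the linked repository).
import torch
import torch.nn as nn


class LKDAttention3D(nn.Module):
    """Large-kernel depthwise convolution attention over a 3D volume.

    A large effective kernel is decomposed into a small depthwise conv, a
    dilated depthwise conv, and a pointwise conv; the result multiplicatively
    gates the input, mimicking the global weighting of self-attention.
    """

    def __init__(self, dim: int, kernel_size: int = 7, dilation: int = 3):
        super().__init__()
        self.dw = nn.Conv3d(dim, dim, kernel_size=5, padding=2, groups=dim)
        self.dw_dilated = nn.Conv3d(
            dim, dim, kernel_size=kernel_size, dilation=dilation,
            padding=(kernel_size // 2) * dilation, groups=dim)
        self.pw = nn.Conv3d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.pw(self.dw_dilated(self.dw(x)))  # attention map
        return x * attn  # gate the input rather than computing softmax attention


class InvertedBottleneck3D(nn.Module):
    """Inverted bottleneck with depthwise convolutional augmentation.

    Plays the role of the Swin Transformer MLP: expand channels, mix
    spatially with a depthwise conv, then project back down.
    """

    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        hidden = dim * expansion
        self.expand = nn.Conv3d(dim, hidden, kernel_size=1)
        self.dw = nn.Conv3d(hidden, hidden, kernel_size=3, padding=1,
                            groups=hidden)
        self.act = nn.GELU()
        self.project = nn.Conv3d(hidden, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.project(self.act(self.dw(self.expand(x))))


class SkipFusion3D(nn.Module):
    """Skip-connection fusion: merge same-resolution encoder and decoder
    features with a pointwise conv so the decoder can reuse encoder detail."""

    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.Conv3d(2 * dim, dim, kernel_size=1)
        self.norm = nn.InstanceNorm3d(dim)
        self.act = nn.GELU()

    def forward(self, enc: torch.Tensor, dec: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(self.fuse(torch.cat([enc, dec], dim=1))))


if __name__ == "__main__":
    x = torch.randn(1, 32, 16, 16, 16)  # (batch, channels, depth, height, width)
    y = InvertedBottleneck3D(32)(LKDAttention3D(32)(x))
    print(SkipFusion3D(32)(x, y).shape)  # torch.Size([1, 32, 16, 16, 16])
```

Because every spatial kernel in the sketch is depthwise, the parameter count of a k x k x k layer grows as C*k^3 rather than C^2*k^3, which is the abstract's argument for why large volumetric kernels can stay lightweight compared with self-attention.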
