Ma Jiajun, Yuan Gang, Guo Chenhua, Gang Xiaoming, Zheng Minting
Shenhua Hollysys Information Technology Co., Ltd., Beijing, China.
The First Affiliated Hospital of Dalian Medical University, Dalian, China.
Front Med (Lausanne). 2023 Sep 28;10:1273441. doi: 10.3389/fmed.2023.1273441. eCollection 2023.
Medical images are information carriers that visually reflect and record the anatomical structure of the human body, and they play an important role in clinical diagnosis, teaching, and research. Modern medicine has become increasingly inseparable from the intelligent processing of medical images. In recent years, there have been growing attempts to apply deep learning theory to medical image segmentation tasks, making it imperative to explore a simple and efficient deep learning algorithm for medical image segmentation. In this paper, we investigate the segmentation of lung nodule images. Addressing the problems of medical image segmentation algorithms noted above, we study medical image fusion algorithms based on a hybrid channel-spatial attention mechanism and medical image segmentation algorithms with a hybrid architecture of Convolutional Neural Networks (CNNs) and Vision Transformers. To address the difficulty that medical image segmentation algorithms have in capturing long-range feature dependencies, this paper proposes SW-UNet, a medical image segmentation model based on a hybrid CNN and Vision Transformer (ViT) framework. The self-attention mechanism and sliding-window design of the Vision Transformer are used to capture global feature associations and overcome the receptive field limitation that convolutional operations incur through their inductive bias. At the same time, a widened self-attention vector is used to streamline the number of modules and compress the model size, fitting the small scale of typical medical datasets, on which larger models are prone to overfitting. Experiments on the LUNA16 lung nodule image dataset validate the algorithm and show that the proposed network achieves efficient medical image segmentation at a lightweight scale. In addition, to validate the transferability of the model, we performed additional validation on other tumor datasets with satisfactory results.
Our research addresses the crucial need for improved medical image segmentation algorithms. By introducing the SW-UNet model, which combines a CNN with a ViT, we successfully capture long-range feature dependencies and overcome the receptive field limitations of traditional convolutional operations. This approach not only improves the efficiency of medical image segmentation but also maintains the model's scalability and adaptability to small medical datasets. The positive outcomes on various tumor datasets underscore the transferability and broad applicability of our proposed model in the field of medical image analysis.
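The windowed self-attention described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification, not the paper's actual layer: it partitions a feature map into non-overlapping windows and computes self-attention within each window, using identity Q/K/V projections in place of learned weights and omitting the window shifting and widened attention vectors of SW-UNet.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, window=4):
    """Self-attention restricted to non-overlapping windows (Swin-style sketch).

    x: (H, W, C) feature map; H and W must be divisible by `window`.
    Hypothetical illustration: real layers use learned Q/K/V projections.
    """
    H, W, C = x.shape
    out = np.empty_like(x)
    for i in range(0, H, window):
        for j in range(0, W, window):
            # Flatten one window into a sequence of window*window tokens.
            win = x[i:i + window, j:j + window].reshape(-1, C)
            q = k = v = win  # identity projections stand in for learned weights
            attn = softmax(q @ k.T / np.sqrt(C))  # (tokens, tokens)
            out[i:i + window, j:j + window] = (attn @ v).reshape(window, window, C)
    return out

feat = np.random.default_rng(0).standard_normal((8, 8, 16))
y = window_self_attention(feat, window=4)
print(y.shape)  # (8, 8, 16)
```

Because attention is computed only within each window, the cost grows linearly with the number of windows rather than quadratically with the full image size; cross-window interaction would come from shifting the window grid between successive layers.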