MMViT-Seg: A lightweight transformer and CNN fusion network for COVID-19 segmentation.

Affiliations

Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, No.37 Xueyuan Road, Haidian District, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology, No.37 Xueyuan Road, Haidian District, Beijing, China; School of Automation Science and Electrical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, China.

Publication information

Comput Methods Programs Biomed. 2023 Mar;230:107348. doi: 10.1016/j.cmpb.2023.107348. Epub 2023 Jan 12.

DOI:10.1016/j.cmpb.2023.107348

PMID:36706618

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9833855/

Abstract

BACKGROUND AND OBJECTIVE

COVID-19 is a serious threat to human health. Traditional convolutional neural networks (CNNs) are widely used for medical image segmentation, while transformers have been adopted for machine vision tasks because they capture long-range relationships better than CNNs. Combining CNNs and transformers for semantic segmentation has attracted intense research interest. However, segmenting medical images remains challenging when only limited datasets are available, as is the case for COVID-19.

METHODS

This study proposes a lightweight transformer+CNN model whose encoder sub-network uses a two-path design that effectively captures both the global dependencies of image features and low-level spatial details. Using a CNN and MobileViT to jointly extract image features reduces the computational cost and complexity of the model while improving segmentation performance. The model is therefore named Mini-MobileViT-Seg (MMViT-Seg). In addition, a multi query attention (MQA) module is proposed to fuse the multi-scale features from different levels of the decoder sub-network, further improving the performance of the model. MQA can simultaneously fuse multi-input, multi-scale low-level and high-level feature maps, and it is trained end to end under ground-truth supervision.
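
The abstract does not describe the internals of the MQA module, so the following is only a generic illustration of the idea of fusing feature maps from different decoder levels, not the authors' method. It upsamples each map to the finest resolution and combines the maps with softmax weights. The function names (`upsample_nearest`, `fuse_multiscale`) and the per-scale scores (`scale_logits`) are hypothetical; in a trained network such scores would be learned rather than passed in by hand.

```python
import math


def upsample_nearest(fmap, out_h, out_w):
    """Resize a 2-D feature map (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_multiscale(fmaps, scale_logits):
    """Softmax-weighted sum of feature maps of different resolutions.

    fmaps:        2-D maps from different (hypothetical) decoder levels.
    scale_logits: one score per map (assumed here; learned in practice).
    """
    out_h = max(len(f) for f in fmaps)
    out_w = max(len(f[0]) for f in fmaps)
    weights = softmax(scale_logits)
    resized = [upsample_nearest(f, out_h, out_w) for f in fmaps]
    return [
        [sum(w * f[r][c] for w, f in zip(weights, resized)) for c in range(out_w)]
        for r in range(out_h)
    ]


# Toy inputs: a coarse 1x1 map, a 2x2 map, and a fine 4x4 map.
high = [[1.0]]
mid = [[0.0, 2.0], [2.0, 0.0]]
low = [[3.0] * 4 for _ in range(4)]

# Equal scores give each scale a weight of 1/3.
fused = fuse_multiscale([high, mid, low], [0.0, 0.0, 0.0])
```

With equal scores the fused map is the plain average of the three upsampled maps; raising one score shifts the result toward that scale, which is the basic mechanism an attention-based fusion module builds on.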

RESULTS

Two-class infection labeling experiments were conducted on three datasets. The final results show that the proposed model achieves the best performance and has the fewest parameters among five popular semantic segmentation algorithms. In the multi-class infection labeling experiments, the proposed model also achieved competitive performance.
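
The abstract does not state which metrics were used to measure performance. As an assumption for illustration, the sketch below computes two metrics that are commonly reported for two-class (infection versus background) segmentation: the Dice coefficient, 2TP / (2TP + FP + FN), and intersection over union (IoU), TP / (TP + FP + FN). The function name `dice_and_iou` and the toy masks are hypothetical.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as 2-D lists of 0/1.

    A label of 1 marks an infected pixel, 0 marks background.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1  # infected pixel found correctly
            elif p == 1:
                fp += 1  # predicted infection that is not in the ground truth
            elif t == 1:
                fn += 1  # missed infected pixel
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty: treat as a perfect match by convention.
        return 1.0, 1.0
    return 2 * tp / (2 * tp + fp + fn), tp / union


# Toy 3x3 example: two pixels agree, one is a false positive, one is missed.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 1],
          [0, 0, 0]]

dice, iou = dice_and_iou(pred, target)
```

Here TP = 2, FP = 1 and FN = 1, so Dice is 4/6 and IoU is 2/4. Dice is never lower than IoU on the same masks, which is worth remembering when comparing numbers reported by different papers.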

CONCLUSIONS

The proposed MMViT-Seg was tested on three COVID-19 segmentation datasets, and the results show that it performs better than the other models. In addition, the proposed MQA module, which effectively fuses multi-scale features from different levels, further improves segmentation accuracy.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b83/9833855/aef82aa4bd28/gr1_lrg.jpg

Similar articles

MC-DC: An MLP-CNN Based Dual-path Complementary Network for Medical Image Segmentation.

Comput Methods Programs Biomed. 2023 Dec;242:107846. doi: 10.1016/j.cmpb.2023.107846. Epub 2023 Oct 5.

MS-TCNet: An effective Transformer-CNN combined network using multi-scale feature learning for 3D medical image segmentation.

Comput Biol Med. 2024 Mar;170:108057. doi: 10.1016/j.compbiomed.2024.108057. Epub 2024 Jan 28.

Dual encoder network with transformer-CNN for multi-organ segmentation.

Med Biol Eng Comput. 2023 Mar;61(3):661-671. doi: 10.1007/s11517-022-02723-9. Epub 2022 Dec 29.

TGDAUNet: Transformer and GCNN based dual-branch attention UNet for medical image segmentation.

Comput Biol Med. 2023 Dec;167:107583. doi: 10.1016/j.compbiomed.2023.107583. Epub 2023 Oct 21.

Transformer guided self-adaptive network for multi-scale skin lesion image segmentation.

Comput Biol Med. 2024 Feb;169:107846. doi: 10.1016/j.compbiomed.2023.107846. Epub 2023 Dec 23.

MESTrans: Multi-scale embedding spatial transformer for medical image segmentation.

Comput Methods Programs Biomed. 2023 May;233:107493. doi: 10.1016/j.cmpb.2023.107493. Epub 2023 Mar 17.

MSCT-UNET: multi-scale contrastive transformer within U-shaped network for medical image segmentation.

Phys Med Biol. 2023 Dec 28;69(1). doi: 10.1088/1361-6560/ad135d.

ETUNet: Exploring efficient transformer enhanced UNet for 3D brain tumor segmentation.

Comput Biol Med. 2024 Mar;171:108005. doi: 10.1016/j.compbiomed.2024.108005. Epub 2024 Jan 23.

LM-Net: A light-weight and multi-scale network for medical image segmentation.

Comput Biol Med. 2024 Jan;168:107717. doi: 10.1016/j.compbiomed.2023.107717. Epub 2023 Nov 23.

Cited by

Manual segmentation of opacities and consolidations on CT of long COVID patients from multiple annotators.

Sci Data. 2025 Mar 7;12(1):402. doi: 10.1038/s41597-025-04709-2.

Hybrid transformer-CNN and LSTM model for lung disease segmentation and classification.

PeerJ Comput Sci. 2024 Dec 13;10:e2444. doi: 10.7717/peerj-cs.2444. eCollection 2024.

MT-SCnet: multi-scale token divided and spatial-channel fusion transformer network for microscopic hyperspectral image segmentation.

Front Oncol. 2024 Dec 3;14:1469293. doi: 10.3389/fonc.2024.1469293. eCollection 2024.

Fully feature fusion based neural network for COVID-19 lesion segmentation in CT images.

Biomed Signal Process Control. 2023 Sep;86:104939. doi: 10.1016/j.bspc.2023.104939. Epub 2023 Apr 10.

References

MiniSeg: An Extremely Minimum Network Based on Lightweight Multiscale Learning for Efficient COVID-19 Segmentation.

IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):8570-8584. doi: 10.1109/TNNLS.2022.3230821. Epub 2024 Jun 3.

A practical artificial intelligence system to diagnose COVID-19 using computed tomography: A multinational external validation study.

Pattern Recognit Lett. 2021 Dec;152:42-49. doi: 10.1016/j.patrec.2021.09.012. Epub 2021 Sep 23.

Inf-Net: Automatic COVID-19 Lung Infection Segmentation From CT Images.

IEEE Trans Med Imaging. 2020 Aug;39(8):2626-2637. doi: 10.1109/TMI.2020.2996645.

UNet++: A Nested U-Net Architecture for Medical Image Segmentation.

Deep Learn Med Image Anal Multimodal Learn Clin Decis Support (2018). 2018 Sep;11045:3-11. doi: 10.1007/978-3-030-00889-5_1. Epub 2018 Sep 20.

Chest CT manifestations of new coronavirus disease 2019 (COVID-19): a pictorial review.

Eur Radiol. 2020 Aug;30(8):4381-4389. doi: 10.1007/s00330-020-06801-0. Epub 2020 Mar 19.

A novel coronavirus outbreak of global health concern.

Lancet. 2020 Feb 15;395(10223):470-473. doi: 10.1016/S0140-6736(20)30185-9. Epub 2020 Jan 24.
