A Fast Attention-Guided Hierarchical Decoding Network for Real-Time Semantic Segmentation.

Authors

Hu Xuegang, Feng Jing

Affiliations

School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.

Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.

Publication information

Sensors (Basel). 2023 Dec 24;24(1):95. doi: 10.3390/s24010095.

DOI: 10.3390/s24010095

PMID: 38202957

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10781398/

Abstract

Semantic segmentation provides accurate scene understanding and decision support for many applications. However, many models pursue high accuracy by adopting complex structures, which lowers inference speed and makes real-time requirements hard to meet. To address this issue, we propose a fast attention-guided hierarchical decoding network for real-time semantic segmentation (FAHDNet), which has an asymmetric U-shaped structure. In the encoder, we design a multi-scale bottleneck residual unit (MBRU) that combines an attention mechanism with decomposed convolution in a parallel structure for aggregating multi-scale information, so that the network handles information at different scales better. In addition, we propose a spatial information compensation (SIC) module that uses the original input to make up for the spatial texture information lost during downsampling. In the decoder, a global attention (GA) module processes the encoder feature maps, strengthening feature interaction in the channel and spatial dimensions and improving the ability to mine feature information. At the same time, a lightweight hierarchical decoder integrates multi-scale features to adapt to targets of different scales and accurately segment objects of different sizes. In experiments on two public datasets, Cityscapes and CamVid, FAHDNet achieves 70.6% mean intersection over union (mIoU) at 135 frames per second (FPS) on Cityscapes and 67.2% mIoU at 335 FPS on CamVid. Compared with existing networks, our model maintains accuracy while achieving faster inference, which improves its practical usability.
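The two quantities reported in the abstract, mIoU and FPS, can be illustrated with a minimal sketch. It uses the standard definition of per-class IoU, TP / (TP + FP + FN), averaged over classes, and converts FPS into a per-frame time budget. The function names and the toy labels are hypothetical and chosen only for illustration; the abstract does not describe the authors' exact evaluation protocol, so this should not be read as their implementation.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                               # class appears in labels or predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def frame_time_ms(fps: float) -> float:
    """Per-frame processing time in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps


# Toy example: six flattened "pixels" with three classes (made-up labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"mIoU = {mean_iou(cm):.3f}")
print(f"135 FPS -> {frame_time_ms(135):.2f} ms per frame")
print(f"335 FPS -> {frame_time_ms(335):.2f} ms per frame")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, giving an mIoU of 0.5. The reported throughputs correspond to roughly 7.4 ms per frame at 135 FPS and about 3.0 ms per frame at 335 FPS.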

Similar articles

A lightweight multi-dimension dynamic convolutional network for real-time semantic segmentation.

Front Neurorobot. 2022 Dec 15;16:1075520. doi: 10.3389/fnbot.2022.1075520. eCollection 2022.

Lightweight semantic segmentation network with configurable context and small object attention.

Front Comput Neurosci. 2023 Oct 23;17:1280640. doi: 10.3389/fncom.2023.1280640. eCollection 2023.

Bilateral attention decoder: A lightweight decoder for real-time semantic segmentation.

Neural Netw. 2021 May;137:188-199. doi: 10.1016/j.neunet.2021.01.021. Epub 2021 Jan 30.

MFAFNet: A Lightweight and Efficient Network with Multi-Level Feature Adaptive Fusion for Real-Time Semantic Segmentation.

Sensors (Basel). 2023 Jul 13;23(14):6382. doi: 10.3390/s23146382.

Faster SCDNet: Real-Time Semantic Segmentation Network with Split Connection and Flexible Dilated Convolution.

Sensors (Basel). 2023 Mar 14;23(6):3112. doi: 10.3390/s23063112.

Rethinking 1D convolution for lightweight semantic segmentation.

Front Neurorobot. 2023 Feb 9;17:1119231. doi: 10.3389/fnbot.2023.1119231. eCollection 2023.

Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes.

Front Neurorobot. 2023 Aug 31;17:1204418. doi: 10.3389/fnbot.2023.1204418. eCollection 2023.

LMFFNet: A Well-Balanced Lightweight Network for Fast and Accurate Semantic Segmentation.

IEEE Trans Neural Netw Learn Syst. 2023 Jun;34(6):3205-3219. doi: 10.1109/TNNLS.2022.3176493. Epub 2023 Jun 1.

A Hierarchical Feature Extraction Network for Fast Scene Segmentation.

Sensors (Basel). 2021 Nov 20;21(22):7730. doi: 10.3390/s21227730.
