

Real-Time Semantic Segmentation with Dual Encoder and Self-Attention Mechanism for Autonomous Driving.

Affiliation

Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei City 106, Taiwan.

Publication information

Sensors (Basel). 2021 Dec 2;21(23):8072. doi: 10.3390/s21238072.


Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8659896/

Abstract

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become a popular and challenging topic in deep learning and computer vision in recent years. However, to deploy a deep learning model on edge devices that accompany the sensors on vehicles, we need to design a structure with the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed for the best accuracy achievable under real-time constraints. Nevertheless, the accuracy of previous real-time semantic segmentation methods still lags well behind that of general semantic segmentation methods. To narrow this gap, we propose a network architecture based on a dual encoder and a self-attention mechanism. On a Cityscapes test submission, it achieves 78.6% mIoU at 39.4 FPS at a 1024 × 2048 resolution.
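The abstract reports accuracy as mIoU (mean intersection-over-union). As a minimal sketch of how that metric is commonly computed, and not the authors' evaluation code, the function below averages per-class IoU over flattened label lists. The name `mean_iou`, the `ignore_index=255` default (a common convention for unlabeled pixels), and the toy labels are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Average per-class IoU over classes that appear in pred or target.

    pred and target are flat sequences of integer class ids of equal length.
    Predictions are assumed to lie in [0, num_classes).
    Pixels whose target equals ignore_index are skipped (assumed convention).
    """
    inter = [0] * num_classes  # pixels where prediction and label agree
    union = [0] * num_classes  # pixels predicted as OR labeled as the class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Classes absent from both prediction and label do not enter the mean.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with 3 classes; the last pixel is unlabeled and ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so `score` is their mean. For the speed figure, 39.4 FPS corresponds to roughly 1000 / 39.4 ≈ 25.4 ms per frame.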


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cfea/8659896/9a5e413da856/sensors-21-08072-g001.jpg

Similar articles

1

Real-Time Semantic Segmentation with Dual Encoder and Self-Attention Mechanism for Autonomous Driving.

Sensors (Basel). 2021 Dec 2;21(23):8072. doi: 10.3390/s21238072.

2

Improving Semantic Segmentation of Urban Scenes for Self-Driving Cars with Synthetic Images.

Sensors (Basel). 2022 Mar 14;22(6):2252. doi: 10.3390/s22062252.

3

A lightweight multi-dimension dynamic convolutional network for real-time semantic segmentation.

Front Neurorobot. 2022 Dec 15;16:1075520. doi: 10.3389/fnbot.2022.1075520. eCollection 2022.

4

Performance estimation for the memristor-based computing-in-memory implementation of extremely factorized network for real-time and low-power semantic segmentation.

Neural Netw. 2023 Mar;160:202-215. doi: 10.1016/j.neunet.2023.01.008. Epub 2023 Jan 13.

5

Fast Panoptic Segmentation with Soft Attention Embeddings.

Sensors (Basel). 2022 Jan 20;22(3):783. doi: 10.3390/s22030783.

6

Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation.

Sensors (Basel). 2023 Jan 4;23(2):581. doi: 10.3390/s23020581.

7

Multiple-Attention Mechanism Network for Semantic Segmentation.

Sensors (Basel). 2022 Jun 13;22(12):4477. doi: 10.3390/s22124477.

8

A Hierarchical Feature Extraction Network for Fast Scene Segmentation.

Sensors (Basel). 2021 Nov 20;21(22):7730. doi: 10.3390/s21227730.

9

Image Semantic Segmentation Method Based on Deep Fusion Network and Conditional Random Field.

Comput Intell Neurosci. 2022 May 14;2022:8961456. doi: 10.1155/2022/8961456. eCollection 2022.

10

Intelligent Semantic Segmentation for Self-Driving Vehicles Using Deep Learning.

Comput Intell Neurosci. 2022 Jan 17;2022:6390260. doi: 10.1155/2022/6390260. eCollection 2022.

Cited by

1

PDC: Pearl Detection with a Counter Based on Deep Learning.

Sensors (Basel). 2022 Sep 16;22(18):7026. doi: 10.3390/s22187026.

