


NDNet: Spacewise Multiscale Representation Learning via Neighbor Decoupling for Real-Time Driving Scene Parsing.

Authors

Li Shu, Yan Qingqing, Zhou Xun, Wang Deming, Liu Chengju, Chen Qijun

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):7884-7898. doi: 10.1109/TNNLS.2022.3221745. Epub 2024 Jun 3.

DOI: 10.1109/TNNLS.2022.3221745
PMID: 36409808
Abstract

As a safety-critical application, autonomous driving requires high-quality semantic segmentation and real-time performance for deployment. Existing methods commonly suffer from information loss and a massive computational burden due to high-resolution input-output and multiscale learning schemes, which run counter to real-time requirements. In contrast to the channelwise information modeling commonly adopted by modern networks, in this article we propose a novel real-time driving scene parsing framework named NDNet from the novel perspective of spacewise neighbor decoupling (ND) and neighbor coupling (NC). We first define and implement the reversible operations called ND and NC, which realize lossless resolution conversion through complementary thumbnail sampling and collation to facilitate spatial modeling. Based on ND and NC, we further propose three modules, namely the local capturer and global dependence builder (LCGB), the spacewise multiscale feature extractor (SMFE), and the high-resolution semantic generator (HSG), which form the whole pipeline of NDNet. The LCGB serves as a stem block that preprocesses the large-scale input for fast but lossless resolution reduction and extracts initial features with global context. The SMFE then performs dense feature extraction and obtains rich multiscale features in the spatial dimension with less computational overhead. For high-resolution semantic output, the HSG is designed for fast resolution reconstruction and adaptive semantic confusion amending. Experiments show the superiority of the proposed method: NDNet achieves state-of-the-art performance on the Cityscapes dataset, reporting 76.47% mIoU at 240+ frames/s and 78.8% mIoU at 150+ frames/s on the benchmark. Code is available at https://github.com/LiShuTJ/NDNet.
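The abstract describes ND and NC as reversible, lossless resolution-conversion operations that sample an image into complementary thumbnails and reassemble them. This pattern resembles a space-to-depth rearrangement and its inverse. Below is a minimal NumPy sketch of such a pair of operations, assuming ND splits a (C, H, W) tensor into r² complementary (H/r, W/r) thumbnails stacked along the channel axis; the function names and layout are illustrative, not the authors' implementation (see the linked GitHub repository for that).

```python
import numpy as np

def neighbor_decouple(x, r=2):
    """Rearrange (C, H, W) into (C*r*r, H//r, W//r): each of the r*r
    output channel groups is one complementary thumbnail, formed by
    taking every r-th pixel with a different (row, col) offset."""
    C, H, W = x.shape
    assert H % r == 0 and W % r == 0
    x = x.reshape(C, H // r, r, W // r, r)
    x = x.transpose(0, 2, 4, 1, 3)          # (C, r, r, H//r, W//r)
    return x.reshape(C * r * r, H // r, W // r)

def neighbor_couple(x, r=2):
    """Exact inverse of neighbor_decouple: interleave the r*r
    thumbnails back into the full-resolution (C, H, W) tensor."""
    C2, h, w = x.shape
    C = C2 // (r * r)
    x = x.reshape(C, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)          # (C, h, r, w, r)
    return x.reshape(C, h * r, w * r)
```

Because both operations are pure index rearrangements, the round trip reconstructs the input exactly, i.e. no information is lost when reducing resolution, which is the property the paper exploits for its stem block and output head.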


Similar Articles

1. NDNet: Spacewise Multiscale Representation Learning via Neighbor Decoupling for Real-Time Driving Scene Parsing.
IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):7884-7898. doi: 10.1109/TNNLS.2022.3221745. Epub 2024 Jun 3.

2. Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation.
IEEE Trans Image Process. 2020 Mar 18. doi: 10.1109/TIP.2020.2976856.

3. LMFFNet: A Well-Balanced Lightweight Network for Fast and Accurate Semantic Segmentation.
IEEE Trans Neural Netw Learn Syst. 2023 Jun;34(6):3205-3219. doi: 10.1109/TNNLS.2022.3176493. Epub 2023 Jun 1.

4. Context and Spatial Feature Calibration for Real-Time Semantic Segmentation.
IEEE Trans Image Process. 2023;32:5465-5477. doi: 10.1109/TIP.2023.3318967. Epub 2023 Oct 25.

5. Semantic segmentation of autonomous driving scenes based on multi-scale adaptive attention mechanism.
Front Neurosci. 2023 Oct 19;17:1291674. doi: 10.3389/fnins.2023.1291674. eCollection 2023.

6. Real-Time Semantic Segmentation via a Densely Aggregated Bilateral Network.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):381-392. doi: 10.1109/TNNLS.2023.3326665. Epub 2025 Jan 7.

7. MAFFNet: real-time multi-level attention feature fusion network with RGB-D semantic segmentation for autonomous driving.
Appl Opt. 2022 Mar 20;61(9):2219-2229. doi: 10.1364/AO.449589.

8. LayerNet: A One-Step Layered Network for Semantic Segmentation at Night.
IEEE Comput Graph Appl. 2023 Nov-Dec;43(6):9-21. doi: 10.1109/MCG.2023.3253167. Epub 2023 Nov 6.

9. CCNet: Criss-Cross Attention for Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):6896-6908. doi: 10.1109/TPAMI.2020.3007032. Epub 2023 May 5.

10. EPMF: Efficient Perception-Aware Multi-Sensor Fusion for 3D Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8258-8273. doi: 10.1109/TPAMI.2024.3402232. Epub 2024 Nov 6.