
Multi-Level and Multi-Scale Feature Aggregation Network for Semantic Segmentation in Vehicle-Mounted Scenes.

Affiliation

School of Software Engineering, South China University of Technology, Guangzhou 510006, China.

Publication information

Sensors (Basel). 2021 May 9;21(9):3270. doi: 10.3390/s21093270.

DOI:10.3390/s21093270
PMID:34065155
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8126014/
Abstract

The main challenges of semantic segmentation in vehicle-mounted scenes are object scale variation and the trade-off between model accuracy and efficiency. Lightweight backbone networks for semantic segmentation usually extract only single-scale features layer by layer, using a fixed receptive field. Most modern real-time semantic segmentation networks heavily compromise spatial details when encoding semantics, sacrificing accuracy for speed. Many improvement strategies adopt dilated convolution and add a sub-network, which introduces either intensive computation or redundant parameters. We propose a multi-level and multi-scale feature aggregation network (MMFANet). A spatial pyramid module is designed by cascading dilated convolutions with different receptive fields to extract multi-scale features layer by layer. Subsequently, a lightweight backbone network is built by reducing the feature channel capacity of the module. To improve the accuracy of our network, we design two additional modules that separately capture spatial details and high-level semantics from the backbone network without significantly increasing the computation cost. Comprehensive experimental results show that our model achieves 79.3% MIoU on the Cityscapes test dataset at a speed of 58.5 FPS, and that it is more accurate than SwiftNet (75.5% MIoU). Furthermore, our model has at least 53.38% fewer parameters than other state-of-the-art models.
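
The idea of cascading dilated convolutions to widen the receptive field layer by layer can be illustrated with a small calculation. The sketch below is not taken from the paper: the 3x3 kernel size and the dilation rates `[1, 2, 4]` are assumptions chosen for illustration, and the function names are hypothetical. It only applies the standard rule that a stride-1 convolution with kernel size k and dilation d adds (k - 1) * d to the receptive field along each axis.

```python
def cascaded_receptive_field(kernel_size, dilations):
    """Receptive field (pixels along one axis) of stride-1 dilated
    convolutions applied one after another."""
    rf = 1  # a single pixel before any convolution
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 inputs,
        # so each layer widens the receptive field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


def layerwise_receptive_fields(kernel_size, dilations):
    """Receptive field after each layer of the cascade, i.e. the set of
    scales a module could aggregate if it taps every intermediate output."""
    return [
        cascaded_receptive_field(kernel_size, dilations[:i])
        for i in range(1, len(dilations) + 1)
    ]


if __name__ == "__main__":
    rates = [1, 2, 4]  # assumed dilation rates, not the paper's values
    print(layerwise_receptive_fields(3, rates))  # [3, 7, 15]
```

With these assumed rates, three 3x3 layers see 3, 7, and 15 pixels respectively, whereas three undilated 3x3 layers would reach only 7. The parameter count stays the same in both cases, since dilation changes the spacing of the kernel taps and not their number.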

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/6c8392ca7c05/sensors-21-03270-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/5121cc5c9f30/sensors-21-03270-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/963ab9052411/sensors-21-03270-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/37e2f66ecd14/sensors-21-03270-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/2b17305ecbc6/sensors-21-03270-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/af51831981a4/sensors-21-03270-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/f5fc4eb2900b/sensors-21-03270-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2de2/8126014/d96e8e1fc9ac/sensors-21-03270-g008.jpg

Similar articles

1
Multi-Level and Multi-Scale Feature Aggregation Network for Semantic Segmentation in Vehicle-Mounted Scenes.
Sensors (Basel). 2021 May 9;21(9):3270. doi: 10.3390/s21093270.
2
Rethinking 1D convolution for lightweight semantic segmentation.
Front Neurorobot. 2023 Feb 9;17:1119231. doi: 10.3389/fnbot.2023.1119231. eCollection 2023.
3
Faster SCDNet: Real-Time Semantic Segmentation Network with Split Connection and Flexible Dilated Convolution.
Sensors (Basel). 2023 Mar 14;23(6):3112. doi: 10.3390/s23063112.
4
MFAFNet: A Lightweight and Efficient Network with Multi-Level Feature Adaptive Fusion for Real-Time Semantic Segmentation.
Sensors (Basel). 2023 Jul 13;23(14):6382. doi: 10.3390/s23146382.
5
A lightweight multi-dimension dynamic convolutional network for real-time semantic segmentation.
Front Neurorobot. 2022 Dec 15;16:1075520. doi: 10.3389/fnbot.2022.1075520. eCollection 2022.
6
Lightweight medical image segmentation network with multi-scale feature-guided fusion.
Comput Biol Med. 2024 Nov;182:109204. doi: 10.1016/j.compbiomed.2024.109204. Epub 2024 Oct 3.
7
Bilateral attention decoder: A lightweight decoder for real-time semantic segmentation.
Neural Netw. 2021 May;137:188-199. doi: 10.1016/j.neunet.2021.01.021. Epub 2021 Jan 30.
8
Context and Spatial Feature Calibration for Real-Time Semantic Segmentation.
IEEE Trans Image Process. 2023;32:5465-5477. doi: 10.1109/TIP.2023.3318967. Epub 2023 Oct 25.
9
DMCT-Net: dual modules convolution transformer network for head and neck tumor segmentation in PET/CT.
Phys Med Biol. 2023 May 22;68(11). doi: 10.1088/1361-6560/acd29f.
10
Lightweight semantic segmentation network with configurable context and small object attention.
Front Comput Neurosci. 2023 Oct 23;17:1280640. doi: 10.3389/fncom.2023.1280640. eCollection 2023.

Cited by

1
Lane and Road Marker Semantic Video Segmentation Using Mask Cropping and Optical Flow Estimation.
Sensors (Basel). 2021 Oct 28;21(21):7156. doi: 10.3390/s21217156.

References

1
Res2Net: A New Multi-Scale Backbone Architecture.
IEEE Trans Pattern Anal Mach Intell. 2021 Feb;43(2):652-662. doi: 10.1109/TPAMI.2019.2938758. Epub 2021 Jan 8.
2
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):834-848. doi: 10.1109/TPAMI.2017.2699184. Epub 2017 Apr 27.
3
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2481-2495. doi: 10.1109/TPAMI.2016.2644615. Epub 2017 Jan 2.
4
Fully Convolutional Networks for Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):640-651. doi: 10.1109/TPAMI.2016.2572683. Epub 2016 May 24.
5
Learning hierarchical features for scene labeling.
IEEE Trans Pattern Anal Mach Intell. 2013 Aug;35(8):1915-29. doi: 10.1109/TPAMI.2012.231.