Waterfall Atrous Spatial Pooling Architecture for Efficient Semantic Segmentation.

Affiliation

Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.

Publication information

Sensors (Basel). 2019 Dec 5;19(24):5361. doi: 10.3390/s19245361.

DOI:10.3390/s19245361

PMID:31817366

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6960670/

Abstract

We propose a new efficient architecture for semantic segmentation, based on a "Waterfall" Atrous Spatial Pooling architecture, that achieves a considerable accuracy increase while decreasing the number of network parameters and memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture while maintaining multiscale fields-of-view comparable to spatial pyramid configurations. Additionally, our method does not rely on a postprocessing stage with Conditional Random Fields, which further reduces complexity and required training time. We demonstrate that the Waterfall approach with a ResNet backbone is a robust and efficient architecture for semantic segmentation obtaining state-of-the-art results with significant reduction in the number of parameters for the Pascal VOC dataset and the Cityscapes dataset.
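
The abstract contrasts parallel spatial-pyramid branches with a cascade whose intermediate outputs are all retained. As a rough illustration only (not the authors' implementation), the sketch below compares the field-of-view of each branch in the two layouts. The 3x3 kernel, the dilation rates 6, 12, 18, 24, and all function names are assumptions introduced here; none of them appear in this record.

```python
from typing import List

# Assumed dilation rates for illustration; this record does not state them.
ASSUMED_RATES = [6, 12, 18, 24]


def effective_kernel(kernel_size: int, rate: int) -> int:
    """Span in pixels covered by one atrous (dilated) convolution."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def parallel_fov(rates: List[int], kernel_size: int = 3) -> List[int]:
    """Spatial-pyramid layout: every branch reads the same input feature map."""
    return [effective_kernel(kernel_size, r) for r in rates]


def waterfall_fov(rates: List[int], kernel_size: int = 3) -> List[int]:
    """Waterfall layout: each branch filters the previous branch's output,
    and every intermediate output is kept as its own scale."""
    fovs = []
    fov = 1  # a single pixel before any convolution
    for r in rates:
        # Stacking a stride-1 convolution widens the field-of-view
        # by (effective kernel - 1).
        fov += effective_kernel(kernel_size, r) - 1
        fovs.append(fov)
    return fovs


if __name__ == "__main__":
    print("parallel :", parallel_fov(ASSUMED_RATES))
    print("waterfall:", waterfall_fov(ASSUMED_RATES))
```

Under these assumptions, both layouts expose one output per rate, so the multiscale structure is preserved, while the waterfall branches reuse earlier filtering instead of each starting from the raw backbone features.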

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/838d/6960670/f97a5a518084/sensors-19-05361-g001.jpg

Similar articles

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.

IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):834-848. doi: 10.1109/TPAMI.2017.2699184. Epub 2017 Apr 27.

GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention.

Sensors (Basel). 2021 Nov 11;21(22):7504. doi: 10.3390/s21227504.

Full-BAPose: Bottom Up Framework for Full Body Pose Estimation.

Sensors (Basel). 2023 Apr 4;23(7):3725. doi: 10.3390/s23073725.

Novel Method of Semantic Segmentation Applicable to Augmented Reality.

Sensors (Basel). 2020 Mar 20;20(6):1737. doi: 10.3390/s20061737.

UniPose+: A Unified Framework for 2D and 3D Human Pose Estimation in Images and Videos.

IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):9641-9653. doi: 10.1109/TPAMI.2021.3124736. Epub 2022 Nov 7.

A multiple-channel and atrous convolution network for ultrasound image segmentation.

Med Phys. 2020 Dec;47(12):6270-6285. doi: 10.1002/mp.14512. Epub 2020 Oct 18.

Fast semantic segmentation method for machine vision inspection based on a fewer-parameters atrous convolution neural network.

PLoS One. 2021 Feb 10;16(2):e0246093. doi: 10.1371/journal.pone.0246093. eCollection 2021.

BMSeNet: Multiscale Context Pyramid Pooling and Spatial Detail Enhancement Network for Real-Time Semantic Segmentation.

Sensors (Basel). 2024 Aug 9;24(16):5145. doi: 10.3390/s24165145.

Cascaded atrous convolution and spatial pyramid pooling for more accurate tumor target segmentation for rectal cancer radiotherapy.

Phys Med Biol. 2018 Sep 17;63(18):185016. doi: 10.1088/1361-6560/aada6c.

Cited by

A New Algorithm for Visual Navigation in Unmanned Aerial Vehicle Water Surface Inspection.

Sensors (Basel). 2025 Apr 20;25(8):2600. doi: 10.3390/s25082600.

DeSPPNet: A Multiscale Deep Learning Model for Cardiac Segmentation.

Diagnostics (Basel). 2024 Dec 14;14(24):2820. doi: 10.3390/diagnostics14242820.

Full-BAPose: Bottom Up Framework for Full Body Pose Estimation.

Sensors (Basel). 2023 Apr 4;23(7):3725. doi: 10.3390/s23073725.

Joint-Based Action Progress Prediction.

Sensors (Basel). 2023 Jan 3;23(1):520. doi: 10.3390/s23010520.

Micro-Expression-Based Emotion Recognition Using Waterfall Atrous Spatial Pyramid Pooling Networks.

Sensors (Basel). 2022 Jun 19;22(12):4634. doi: 10.3390/s22124634.

GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention.

Sensors (Basel). 2021 Nov 11;21(22):7504. doi: 10.3390/s21227504.

Application of Deep Convolution Network to Automated Image Segmentation of Chest CT for Patients With Tumor.

Front Oncol. 2021 Sep 29;11:719398. doi: 10.3389/fonc.2021.719398. eCollection 2021.

A feature fusion deep-projection convolution neural network for vehicle detection in aerial images.

PLoS One. 2021 May 7;16(5):e0250782. doi: 10.1371/journal.pone.0250782. eCollection 2021.

Image Segmentation Using Encoder-Decoder with Deformable Convolutions.

Sensors (Basel). 2021 Feb 24;21(5):1570. doi: 10.3390/s21051570.

Advanced Computational Intelligence for Object Detection, Feature Extraction and Recognition in Smart Sensor Environments.

Sensors (Basel). 2020 Dec 24;21(1):45. doi: 10.3390/s21010045.
