Lin Xiao, Sánchez-Escobedo Dalila, Casas Josep R, Pardàs Montse
Visual Interactions and Communication Technologies (Vicomtech), 20009 Donostia/San Sebastián, Spain.
Image Processing Group, TSC Department, Technical University of Catalonia (UPC), 08034 Barcelona, Spain.
Sensors (Basel). 2019 Apr 15;19(8):1795. doi: 10.3390/s19081795.
Semantic segmentation and depth estimation are two important tasks in computer vision, and many methods have been developed to tackle them. These two tasks are commonly addressed independently, but the idea of merging them into a single framework has recently been studied, under the assumption that two highly correlated tasks can benefit each other and improve estimation accuracy. In this paper, depth estimation and semantic segmentation are jointly addressed from a single RGB input image within a unified convolutional neural network. We analyze two different architectures to evaluate which features are most useful when shared by the two tasks and which should be kept separate to achieve a mutual improvement. Our approaches are evaluated under two different scenarios designed to compare our results against both single-task and multi-task methods. Qualitative and quantitative experiments demonstrate that our methodology outperforms state-of-the-art single-task approaches, while obtaining competitive results compared with other multi-task methods.
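To illustrate the kind of joint architecture the abstract describes, the following is a minimal sketch of a multi-task CNN in PyTorch: a shared encoder extracts features from a single RGB image, and two task-specific heads produce per-pixel class scores and a depth map. The class/module names, layer sizes, class count, and losses here are assumptions for illustration only, not the architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class JointSegDepthNet(nn.Module):
    """Illustrative multi-task CNN: a shared encoder feeds two task-specific
    decoder heads, one for semantic segmentation and one for depth estimation.
    (Hypothetical sketch, not the paper's exact architecture.)"""

    def __init__(self, num_classes: int = 40):
        super().__init__()
        # Shared feature extractor (features reused by both tasks).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        # Task-specific branch: per-pixel semantic class scores.
        self.seg_head = nn.Sequential(
            nn.Conv2d(128, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, num_classes, kernel_size=1),
        )
        # Task-specific branch: single-channel depth map.
        self.depth_head = nn.Sequential(
            nn.Conv2d(128, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, kernel_size=1),
        )

    def forward(self, rgb: torch.Tensor):
        shared = self.encoder(rgb)  # features shared by both tasks
        return self.seg_head(shared), self.depth_head(shared)


if __name__ == "__main__":
    net = JointSegDepthNet(num_classes=40)
    x = torch.randn(1, 3, 240, 320)  # single RGB input image
    seg_logits, depth = net(x)
    # Joint training sums a segmentation loss and a depth regression loss.
    seg_target = torch.randint(0, 40, (1, 240, 320))
    depth_target = torch.rand(1, 1, 240, 320)
    loss = nn.CrossEntropyLoss()(seg_logits, seg_target) + nn.L1Loss()(depth, depth_target)
    loss.backward()
```

In this kind of design, the choice of where to split the shared trunk into task-specific branches determines which features are shared and which remain separate, which is the design question the two architectures in the paper are meant to probe.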