
Object class segmentation of RGB-D video using recurrent convolutional neural networks.

Author Information

Mircea Serban Pavel, Hannes Schulz, Sven Behnke

Affiliations

Universität Bonn, Computer Science Institute VI, Friedrich-Ebert-Allee 144, 53113 Bonn, Germany.

Publication Information

Neural Netw. 2017 Apr;88:105-113. doi: 10.1016/j.neunet.2017.01.003. Epub 2017 Jan 30.

Abstract

Object class segmentation is a computer vision task which requires labeling each pixel of an image with the class of the object it belongs to. Deep convolutional neural networks (DNN) are able to learn and take advantage of local spatial correlations required for this task. They are, however, restricted by their small, fixed-sized filters, which limits their ability to learn long-range dependencies. Recurrent Neural Networks (RNN), on the other hand, do not suffer from this restriction. Their iterative interpretation allows them to model long-range dependencies by propagating activity. This property is especially useful when labeling video sequences, where both spatial and temporal long-range dependencies occur. In this work, a novel RNN architecture for object class segmentation is presented. We investigate several ways to train such a network. We evaluate our models on the challenging NYU Depth v2 dataset for object class segmentation and obtain competitive results.
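The abstract's core argument is that iterating a small shared filter lets activity propagate, so the effective receptive field grows with each recurrent step. This is a minimal numpy sketch of that idea (not the paper's actual architecture): a single impulse input, convolved repeatedly with the same 3x3 filter, spreads its influence one filter-radius further per iteration.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D cross-correlation with a small kernel."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding keeps output size
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def recurrent_conv(x, k, steps):
    """Apply the same filter recurrently; each step enlarges the
    receptive field by the filter radius, modeling longer-range
    dependencies without larger filters."""
    h = x
    for _ in range(steps):
        h = np.tanh(conv2d_same(h, k))
    return h

# A single impulse: after 1 step its influence covers 3x3 pixels,
# after 3 steps it covers 7x7 pixels.
x = np.zeros((9, 9))
x[4, 4] = 1.0
k = np.full((3, 3), 0.1)
print(np.count_nonzero(recurrent_conv(x, k, 1)))  # 9
print(np.count_nonzero(recurrent_conv(x, k, 3)))  # 49
```

The same mechanism extends to the temporal axis when frames of a video are fed in sequence: activity carried across iterations lets a pixel's label depend on distant frames as well as distant pixels.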
