
Object class segmentation of RGB-D video using recurrent convolutional neural networks.

Author Information

Mircea Serban Pavel, Hannes Schulz, Sven Behnke

Affiliations

Universität Bonn, Computer Science Institute VI, Friedrich-Ebert-Allee 144, 53113 Bonn, Germany.

Publication Information

Neural Netw. 2017 Apr;88:105-113. doi: 10.1016/j.neunet.2017.01.003. Epub 2017 Jan 30.

Abstract

Object class segmentation is a computer vision task which requires labeling each pixel of an image with the class of the object it belongs to. Deep convolutional neural networks (DNN) are able to learn and take advantage of local spatial correlations required for this task. They are, however, restricted by their small, fixed-sized filters, which limits their ability to learn long-range dependencies. Recurrent Neural Networks (RNN), on the other hand, do not suffer from this restriction. Their iterative interpretation allows them to model long-range dependencies by propagating activity. This property is especially useful when labeling video sequences, where both spatial and temporal long-range dependencies occur. In this work, a novel RNN architecture for object class segmentation is presented. We investigate several ways to train such a network. We evaluate our models on the challenging NYU Depth v2 dataset for object class segmentation and obtain competitive results.
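The abstract's core argument is that iterating a small shared filter lets activity propagate, so the effective receptive field grows with each recurrent step. This is a minimal numpy sketch of that idea (not the paper's actual architecture): a single impulse input, convolved repeatedly with the same 3x3 filter, spreads its influence one filter-radius further per iteration.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D cross-correlation with a small kernel."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding keeps output size
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def recurrent_conv(x, k, steps):
    """Apply the same filter recurrently; each step enlarges the
    receptive field by the filter radius, modeling longer-range
    dependencies without larger filters."""
    h = x
    for _ in range(steps):
        h = np.tanh(conv2d_same(h, k))
    return h

# A single impulse: after 1 step its influence covers 3x3 pixels,
# after 3 steps it covers 7x7 pixels.
x = np.zeros((9, 9))
x[4, 4] = 1.0
k = np.full((3, 3), 0.1)
print(np.count_nonzero(recurrent_conv(x, k, 1)))  # 9
print(np.count_nonzero(recurrent_conv(x, k, 3)))  # 49
```

The same mechanism extends to the temporal axis when frames of a video are fed in sequence: activity carried across iterations lets a pixel's label depend on distant frames as well as distant pixels.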
