Pavel Mircea Serban, Schulz Hannes, Behnke Sven
Universität Bonn, Computer Science Institute VI, Friedrich-Ebert-Allee 144, 53113 Bonn, Germany.
Neural Netw. 2017 Apr;88:105-113. doi: 10.1016/j.neunet.2017.01.003. Epub 2017 Jan 30.
Object class segmentation is a computer vision task that requires labeling each pixel of an image with the class of the object it belongs to. Deep convolutional neural networks (DNN) are able to learn and exploit the local spatial correlations required for this task. They are, however, restricted by their small, fixed-size filters, which limit their ability to learn long-range dependencies. Recurrent neural networks (RNN), on the other hand, do not suffer from this restriction. Their iterative interpretation allows them to model long-range dependencies by propagating activity. This property is especially useful when labeling video sequences, where both spatial and temporal long-range dependencies occur. In this work, a novel RNN architecture for object class segmentation is presented. We investigate several ways to train such a network. We evaluate our models on the challenging NYU Depth v2 dataset for object class segmentation and obtain competitive results.
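The claim that iteration lets a small filter reach distant pixels can be illustrated with a toy example. The sketch below is not the architecture of the paper; it assumes a single-channel image, a fixed 3x3 averaging filter, and a ReLU nonlinearity, and all function names (`conv3x3_same`, `recurrent_steps`, `influence_radius`) are hypothetical. It feeds the output of the filter back into itself and measures how far a single active pixel has spread after each step.

```python
import numpy as np

def conv3x3_same(x, kernel):
    """Zero-padded 3x3 cross-correlation of a 2-D array (output has the same shape)."""
    h, w = x.shape
    padded = np.pad(x, 1)  # one pixel of zeros on every side
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def recurrent_steps(x, kernel, steps):
    """Apply the same 3x3 filter `steps` times, feeding each output back in."""
    state = x.astype(float)
    for _ in range(steps):
        state = np.maximum(conv3x3_same(state, kernel), 0.0)  # ReLU
    return state

def influence_radius(state, center):
    """Largest Chebyshev distance from `center` to any non-zero activation."""
    ys, xs = np.nonzero(state)
    return int(max(np.abs(ys - center[0]).max(), np.abs(xs - center[1]).max()))

# Assumed toy input: one active pixel in the middle of a 21x21 image.
impulse = np.zeros((21, 21))
impulse[10, 10] = 1.0
kernel = np.full((3, 3), 1.0 / 9.0)  # illustrative averaging filter, not a learned one

radii = [
    influence_radius(recurrent_steps(impulse, kernel, t), (10, 10))
    for t in range(1, 6)
]
print(radii)
```

Under these assumptions the radius grows by one pixel per iteration, so the region a pixel can influence is set by the number of recurrent steps and not by the filter size alone. A feed-forward network with the same filter would need one additional layer, with its own weights, for each extra pixel of reach.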