Joint Task-Recursive Learning for RGB-D Scene Understanding

Author information

Zhang Zhenyu, Cui Zhen, Xu Chunyan, Jie Zequn, Li Xiang, Yang Jian

Publication information

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2608-2623. doi: 10.1109/TPAMI.2019.2926728. Epub 2019 Jul 10.

DOI:10.1109/TPAMI.2019.2926728

Abstract

RGB-D scene understanding with a monocular camera is an emerging and challenging topic with many potential applications. In this paper, we propose a novel Task-Recursive Learning (TRL) framework that jointly and recurrently performs three representative tasks: depth estimation, surface normal prediction, and semantic segmentation. TRL recursively refines the predictions through a series of task-level interactions, where a single cross-task interaction is abstracted as one network block at one time stage. In each stage, we serialize the tasks into a sequence and then recursively perform their interactions. To adaptively enhance counterpart patterns, we encapsulate the interactions in a dedicated Task-Attentional Module (TAM) so that the tasks mutually boost one another. Across stages, the historical experience from previous task states is selectively propagated to later stages by a Feature-Selection unit (FS-Unit), which exploits complementary information across tasks. The sequence of task-level interactions also evolves along a coarse-to-fine scale space so that the required details are refined progressively. Finally, the task-abstracted sequence problem of multi-task prediction is framed as a recursive network. Extensive experiments on the NYU-Depth v2 and SUN RGB-D datasets demonstrate that our method recursively refines the results of the three tasks and achieves state-of-the-art performance.
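The control flow described in the abstract (tasks serialized within a stage, interactions repeated across coarse-to-fine stages, earlier states selectively carried forward) can be illustrated with a toy sketch. This is not the authors' network: the task order, the sigmoid gates, the 1-D list features, and every function name below (`task_attention`, `feature_select`, `upsample`, `task_recursive_pass`) are assumptions made for illustration only, and the real TAM and FS-Unit are learned modules whose internals the abstract does not specify.

```python
import math

# Assumed task order; the abstract only says the tasks are serialized.
TASKS = ["depth", "normal", "segmentation"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def task_attention(current, counterpart):
    """Toy stand-in for a Task-Attentional Module (hypothetical form):
    gate the counterpart feature element-wise and add it to the current one."""
    gates = [sigmoid(c * p) for c, p in zip(current, counterpart)]
    return [c + g * p for c, g, p in zip(current, gates, counterpart)]


def feature_select(history, fresh):
    """Toy stand-in for an FS-Unit (hypothetical form): a per-element gate
    decides how much of the previous-stage feature is carried over."""
    gates = [sigmoid(h) for h in history]
    return [g * h + (1.0 - g) * f for g, h, f in zip(gates, history, fresh)]


def upsample(feature):
    """Nearest-neighbour 2x upsampling of a 1-D feature (coarse to fine)."""
    return [v for v in feature for _ in range(2)]


def task_recursive_pass(encoder_features):
    """Run the serialized task interactions over all stages.

    encoder_features: list of 1-D features, coarsest first; each is assumed
    to be exactly twice as long as the previous one.
    Returns the final per-task features and the (stage, task) update order.
    """
    state = {}       # latest feature per task
    previous = None  # feature of the most recently updated task
    trace = []
    for stage, fresh in enumerate(encoder_features):
        for task in TASKS:
            if task in state:
                # Carry the task's own history from the coarser stage.
                feat = feature_select(upsample(state[task]), fresh)
            else:
                feat = list(fresh)
            if previous is not None:
                # Bring the counterpart to the current resolution if needed.
                if len(previous) != len(feat):
                    previous = upsample(previous)
                feat = task_attention(feat, previous)
            state[task] = feat
            previous = feat
            trace.append((stage, task))
    return state, trace


if __name__ == "__main__":
    final_state, order = task_recursive_pass([[0.1, -0.2], [0.3, 0.0, -0.1, 0.2]])
    print(order)
    print({task: len(feat) for task, feat in final_state.items()})
```

Each task update in the sketch sees the most recent output of the preceding task in the sequence, so refinements made for one task feed the next, while `feature_select` decides how much of a task's own coarser-stage state survives into the finer stage.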

