

Joint Task-Recursive Learning for RGB-D Scene Understanding

Author Information

Zhang Zhenyu, Cui Zhen, Xu Chunyan, Jie Zequn, Li Xiang, Yang Jian

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2608-2623. doi: 10.1109/TPAMI.2019.2926728. Epub 2019 Jul 10.

DOI: 10.1109/TPAMI.2019.2926728
PMID: 31295103
Abstract

RGB-D scene understanding under a monocular camera is an emerging and challenging topic with many potential applications. In this paper, we propose a novel Task-Recursive Learning (TRL) framework to jointly and recurrently conduct three representative tasks therein: depth estimation, surface normal prediction and semantic segmentation. TRL recursively refines the prediction results through a series of task-level interactions, where one cross-task interaction is abstracted as one network block of one time stage. In each stage, we serialize multiple tasks into a sequence and then recursively perform their interactions. To adaptively enhance counterpart patterns, we encapsulate interactions into a specific Task-Attentional Module (TAM) so that the tasks mutually boost each other. Across stages, the historical experiences of previous task states are selectively propagated into the next stages by a Feature-Selection unit (FS-Unit), which takes advantage of complementary information across tasks. The sequence of task-level interactions also evolves along a coarse-to-fine scale space so that the required details are refined progressively. Finally, the task-abstracted sequence problem of multi-task prediction is framed as a recursive network. Extensive experiments on the NYU-Depth v2 and SUN RGB-D datasets demonstrate that our method can recursively refine the results of the three tasks and achieves state-of-the-art performance.
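The stage-wise recursion the abstract describes (serialize the tasks, mix each task with its counterparts through the TAM, carry history forward through the FS-Unit) can be illustrated as a control-flow sketch. Everything below is an illustrative assumption, not the paper's implementation: the function names `tam`, `fs_unit` and `trl` are invented here, and scalar stand-in "features" replace the convolutional feature maps of the real network.

```python
def tam(current, counterpart, gate=0.5):
    """Toy stand-in for the Task-Attentional Module: blend the current
    task's feature with its counterparts', weighted by an attention gate."""
    return (1 - gate) * current + gate * counterpart


def fs_unit(history, new, keep=0.3):
    """Toy stand-in for the Feature-Selection unit: selectively propagate
    part of the previous stage's state into the next stage."""
    return keep * history + (1 - keep) * new


def trl(features, stages=3):
    """Serialize the tasks and recursively perform cross-task interactions,
    one TAM mixing step per task per stage."""
    history = dict(features)
    for _ in range(stages):
        updated = {}
        for task, feat in features.items():
            # each task attends to the average of the other tasks' features
            others = [f for t, f in features.items() if t != task]
            counterpart = sum(others) / len(others)
            refined = tam(feat, counterpart)
            updated[task] = fs_unit(history[task], refined)
        history = dict(updated)
        features = updated
    return features


out = trl({"depth": 1.0, "normal": 0.0, "segmentation": 2.0})
```

Each stage pulls the three task states toward one another, mimicking how cross-task interaction shares complementary evidence; the coarse-to-fine scale progression of the real network is omitted from this sketch.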


Similar Articles

1. Joint Task-Recursive Learning for RGB-D Scene Understanding.
   IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2608-2623. doi: 10.1109/TPAMI.2019.2926728. Epub 2019 Jul 10.
2. Progressive Hard-Mining Network for Monocular Depth Estimation.
   IEEE Trans Image Process. 2018 Aug;27(8):3691-3702. doi: 10.1109/TIP.2018.2821979.
3. Collaborative Deconvolutional Neural Networks for Joint Depth Estimation and Semantic Segmentation.
   IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5655-5666. doi: 10.1109/TNNLS.2017.2787781. Epub 2018 Mar 20.
4. Simultaneous Semantic Segmentation and Depth Completion with Constraint of Boundary.
   Sensors (Basel). 2020 Jan 23;20(3):635. doi: 10.3390/s20030635.
5. Joint-Confidence-Guided Multi-Task Learning for 3D Reconstruction and Understanding From Monocular Camera.
   IEEE Trans Image Process. 2023;32:1120-1133. doi: 10.1109/TIP.2023.3240834. Epub 2023 Feb 13.
6. BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation.
   IEEE Trans Image Process. 2024;33:3964-3976. doi: 10.1109/TIP.2024.3416065. Epub 2024 Jun 28.
7. Laplacian Pyramid Neural Network for Dense Continuous-Value Regression for Complex Scenes.
   IEEE Trans Neural Netw Learn Syst. 2021 Nov;32(11):5034-5046. doi: 10.1109/TNNLS.2020.3026669. Epub 2021 Oct 27.
8. ASK: Adaptively Selecting Key Local Features for RGB-D Scene Recognition.
   IEEE Trans Image Process. 2021;30:2722-2733. doi: 10.1109/TIP.2021.3053459. Epub 2021 Feb 10.
9. DMRA: Depth-Induced Multi-Scale Recurrent Attention Network for RGB-D Saliency Detection.
   IEEE Trans Image Process. 2022;31:2321-2336. doi: 10.1109/TIP.2022.3154931. Epub 2022 Mar 11.
10. CMANet: Cross-Modality Attention Network for Indoor-Scene Semantic Segmentation.
    Sensors (Basel). 2022 Nov 5;22(21):8520. doi: 10.3390/s22218520.

Cited By

1. Radar-Camera Fusion Network for Depth Estimation in Structured Driving Scenes.
   Sensors (Basel). 2023 Aug 31;23(17):7560. doi: 10.3390/s23177560.