Siamese Network for RGB-D Salient Object Detection and Beyond.

Author information

Fu Keren, Fan Deng-Ping, Ji Ge-Peng, Zhao Qijun, Shen Jianbing, Zhu Ce

Publication information

IEEE Trans Pattern Anal Mach Intell. 2021 Apr 16;PP. doi: 10.1109/TPAMI.2021.3073689.

Abstract

Existing RGB-D salient object detection (SOD) models usually treat RGB and depth as independent information and design separate networks for feature extraction from each. Such schemes can easily be constrained by a limited amount of training data or over-reliance on an elaborately designed training process. Inspired by the observation that RGB and depth modalities actually present certain commonality in distinguishing salient objects, a novel joint learning and densely cooperative fusion (JL-DCF) architecture is designed to learn from both RGB and depth inputs through a shared network backbone, known as the Siamese architecture. In this paper, we propose two effective components: joint learning (JL), and densely cooperative fusion (DCF). The JL module provides robust saliency feature learning by exploiting cross-modal commonality via a Siamese network, while the DCF module is introduced for complementary feature discovery. Comprehensive experiments using 5 popular metrics show that the designed framework yields a robust RGB-D saliency detector with good generalization. As a result, JL-DCF significantly advances the SOTAs by an average of ~2.0% (F-measure) across 7 challenging datasets. In addition, we show that JL-DCF is readily applicable to other related multi-modal detection tasks, including RGB-T SOD and video SOD, achieving comparable or better performance.
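
The shared-backbone idea described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the single linear-plus-ReLU "backbone", the replication of depth to three channels, and the add-and-multiply fusion rule are all assumptions chosen for brevity, and every function name here is hypothetical. The only point it shows is that RGB and depth samples are stacked into one batch and passed through the same weights, then split again for fusion.

```python
import random


def make_backbone(in_dim, out_dim, seed=0):
    """Create one weight matrix; it is shared by both modalities (assumption: a single linear layer)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def backbone_forward(weights, batch):
    """Apply the same linear layer followed by ReLU to every sample in the batch."""
    return [
        [max(0.0, sum(w * x for w, x in zip(row, sample))) for row in weights]
        for sample in batch
    ]


def siamese_forward(weights, rgb_batch, depth_batch):
    """Joint learning sketch: stack RGB and depth along the batch axis and use one backbone."""
    # Assumption: each depth value is replicated to 3 channels so it matches the RGB input size.
    depth_as_rgb = [[d, d, d] for d in depth_batch]
    joint_batch = rgb_batch + depth_as_rgb          # concatenate along the batch dimension
    features = backbone_forward(weights, joint_batch)
    n = len(rgb_batch)
    return features[:n], features[n:]               # split back into RGB and depth features


def fuse(rgb_feats, depth_feats):
    """Placeholder fusion (assumption): element-wise sum plus element-wise product."""
    return [
        [r + d + r * d for r, d in zip(fr, fd)]
        for fr, fd in zip(rgb_feats, depth_feats)
    ]


if __name__ == "__main__":
    weights = make_backbone(in_dim=3, out_dim=4, seed=1)
    rgb = [[0.2, 0.4, 0.6], [0.1, 0.1, 0.1]]        # two toy "pixels" with 3 colour channels
    depth = [0.5, 0.9]                              # matching toy depth values
    rgb_f, depth_f = siamese_forward(weights, rgb, depth)
    print(fuse(rgb_f, depth_f))
```

Because both modalities pass through the same `weights`, any update to the backbone is driven by RGB and depth samples together, which is the commonality the abstract appeals to. A real model would replace the toy layer with a convolutional backbone and fuse features at several scales.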
