

Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection.

Author Information

Li Gongyang, Liu Zhi, Chen Minyu, Bai Zhen, Lin Weisi, Ling Haibin

Publication Information

IEEE Trans Image Process. 2021;30:3528-3542. doi: 10.1109/TIP.2021.3062689. Epub 2021 Mar 11.

Abstract

Existing RGB-D Salient Object Detection (SOD) methods take advantage of depth cues to improve detection accuracy, while paying insufficient attention to the quality of depth information. In practice, a depth map is often of uneven quality and sometimes suffers from distractors, due to various factors in the acquisition procedure. In this article, to mitigate distractors in depth maps and highlight salient objects in RGB images, we propose a Hierarchical Alternate Interaction Network (HAINet) for RGB-D SOD. Specifically, HAINet consists of three key stages: feature encoding, cross-modal alternate interaction, and saliency reasoning. The main innovation in HAINet is the Hierarchical Alternate Interaction Module (HAIM), which plays a key role in the second stage for cross-modal feature interaction. HAIM first uses RGB features to filter distractors out of depth features, and then the purified depth features are exploited to enhance the RGB features in turn. The alternate RGB-depth-RGB interaction proceeds in a hierarchical manner, progressively integrating local and global contexts within a single feature scale. In addition, we adopt a hybrid loss function to facilitate the training of HAINet. Extensive experiments on seven datasets demonstrate that our HAINet not only achieves competitive performance compared with 19 relevant state-of-the-art methods, but also reaches a real-time processing speed of 43 fps on a single NVIDIA Titan X GPU. The code and results of our method are available at https://github.com/MathLee/HAINet.
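The RGB→depth→RGB alternate interaction described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual HAIM: the channel-averaged sigmoid gate and the residual fusion below are stand-ins for the learned attention operators that HAIM applies hierarchically over local and global contexts (see the official code at https://github.com/MathLee/HAINet for the real implementation).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def alternate_interaction(rgb, depth):
    """One illustrative RGB-depth-RGB alternate interaction step.

    rgb, depth: feature maps of shape (C, H, W) at the same scale.
    Returns RGB features enhanced by the purified depth features.
    """
    # Step 1: an RGB-guided spatial gate suppresses distractors in the
    # depth features (a placeholder for HAIM's learned attention).
    gate = sigmoid(rgb.mean(axis=0, keepdims=True))  # (1, H, W)
    depth_purified = depth * gate

    # Step 2: the purified depth features enhance the RGB features in
    # turn, here via a simple residual addition.
    return rgb + depth_purified

rgb = np.random.rand(8, 4, 4)    # toy RGB features
depth = np.random.rand(8, 4, 4)  # toy depth features
out = alternate_interaction(rgb, depth)
print(out.shape)  # (8, 4, 4)
```

In HAINet this interaction is repeated hierarchically at each feature scale, so that the RGB stream both cleans and then benefits from the depth stream before saliency reasoning.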

