IEEE Trans Image Process. 2021;30:5678-5691. doi: 10.1109/TIP.2021.3087412. Epub 2021 Jun 18.
RGB-thermal salient object detection (SOD) aims to segment the common prominent regions of a visible image and its corresponding thermal infrared image, a task we refer to as RGBT SOD. Existing methods do not fully explore and exploit the potential of the complementarity between different modalities and of the multi-type cues in image content, both of which play a vital role in achieving accurate results. In this paper, we propose a multi-interactive dual-decoder to mine and model multi-type interactions for accurate RGBT SOD. Specifically, we first encode the two modalities into multi-level multi-modal feature representations. Then, we design a novel dual-decoder to conduct interactions among multi-level features, the two modalities, and global contexts. With these interactions, our method works well in diverse challenging scenarios, even in the presence of an invalid modality. Finally, we carry out extensive experiments on public RGBT and RGBD SOD datasets, and the results show that the proposed method achieves outstanding performance against state-of-the-art algorithms. The source code has been released at: https://github.com/lz118/Multi-interactive-Dual-decoder.
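The sketch below is a minimal illustration, not the authors' released code, of the high-level data flow the abstract describes: two modality encoders produce multi-level features, a dual decoder lets the RGB and thermal branches exchange information at every level, and a pooled global-context vector modulates both branches. All module names, channel widths, and the specific fusion operations are illustrative assumptions; the actual architecture is in the linked repository.

```python
# Illustrative sketch only: the backbone, channel widths, and fusion choices
# are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    """3x3 conv + BN + ReLU used throughout the sketch."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyEncoder(nn.Module):
    """Three-level encoder standing in for a real backbone (e.g. VGG/ResNet)."""
    def __init__(self, in_ch=3, widths=(32, 64, 128)):
        super().__init__()
        self.stages = nn.ModuleList()
        prev = in_ch
        for w in widths:
            self.stages.append(nn.Sequential(conv_block(prev, w), nn.MaxPool2d(2)))
            prev = w

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # shallow-to-deep multi-level features


class MultiInteractiveDualDecoder(nn.Module):
    """Two decoder branches (RGB / thermal) that interact at each level and
    are both conditioned on a global-context vector pooled from both modalities."""
    def __init__(self, widths=(32, 64, 128)):
        super().__init__()
        self.enc_rgb = TinyEncoder(3, widths)
        self.enc_t = TinyEncoder(3, widths)
        rev = list(reversed(widths))  # deep-to-shallow channel widths
        self.global_fc = nn.Linear(2 * rev[0], rev[0])
        # Cross-modal interaction blocks: each branch fuses its own state with the other's.
        self.dec_rgb = nn.ModuleList([conv_block(2 * w, w) for w in rev])
        self.dec_t = nn.ModuleList([conv_block(2 * w, w) for w in rev])
        # Lateral 1x1 convs to match the next (shallower) encoder level's width.
        self.lat_rgb = nn.ModuleList([nn.Conv2d(rev[i], rev[i + 1], 1) for i in range(len(rev) - 1)])
        self.lat_t = nn.ModuleList([nn.Conv2d(rev[i], rev[i + 1], 1) for i in range(len(rev) - 1)])
        self.head = nn.Conv2d(2 * rev[-1], 1, 1)  # fused saliency prediction

    def forward(self, rgb, thermal):
        f_rgb, f_t = self.enc_rgb(rgb), self.enc_t(thermal)
        # Global-context interaction: gate both top-level features with a pooled vector.
        g = torch.sigmoid(self.global_fc(
            torch.cat([f_rgb[-1].mean((2, 3)), f_t[-1].mean((2, 3))], dim=1)))[:, :, None, None]
        d_rgb, d_t = f_rgb[-1] * g, f_t[-1] * g
        n = len(self.dec_rgb)
        for i in range(n):
            # Cross-modal interaction: each branch conditions on the other branch's state.
            nr = self.dec_rgb[i](torch.cat([d_rgb, d_t], 1))
            nt = self.dec_t[i](torch.cat([d_t, d_rgb], 1))
            d_rgb, d_t = nr, nt
            if i < n - 1:
                # Multi-level interaction: upsample and merge the shallower encoder features.
                skip_r, skip_t = f_rgb[-(i + 2)], f_t[-(i + 2)]
                d_rgb = self.lat_rgb[i](F.interpolate(
                    d_rgb, size=skip_r.shape[2:], mode="bilinear", align_corners=False)) + skip_r
                d_t = self.lat_t[i](F.interpolate(
                    d_t, size=skip_t.shape[2:], mode="bilinear", align_corners=False)) + skip_t
        # Fuse the two branches and predict a full-resolution saliency map.
        sal = self.head(torch.cat([d_rgb, d_t], 1))
        return F.interpolate(sal, size=rgb.shape[2:], mode="bilinear", align_corners=False)


if __name__ == "__main__":
    model = MultiInteractiveDualDecoder()
    rgb = torch.randn(1, 3, 224, 224)
    thermal = torch.randn(1, 3, 224, 224)
    print(model(rgb, thermal).shape)  # torch.Size([1, 1, 224, 224])
```

The dual-branch loop is where the three kinds of interaction named in the abstract appear in this sketch: the concatenation across branches models cross-modal interaction, the upsample-and-merge with shallower encoder features models multi-level interaction, and the sigmoid gate from pooled features models global-context interaction.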