Lightweight Cross-Modal Information Mutual Reinforcement Network for RGB-T Salient Object Detection

Author information

Lv Chengtao, Wan Bin, Zhou Xiaofei, Sun Yaoqi, Zhang Jiyong, Yan Chenggang

Affiliations

School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China.

Lishui Institute, Hangzhou Dianzi University, Lishui 323000, China.

Publication information

Entropy (Basel). 2024 Jan 31;26(2):130. doi: 10.3390/e26020130.

DOI:10.3390/e26020130

PMID:38392385

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10888287/

Abstract

RGB-T salient object detection (SOD) has made significant progress in recent years. However, most existing methods rely on heavy models that are impractical for mobile devices. In addition, the design of cross-modal and cross-level feature fusion still leaves room for improvement. To address these issues, we propose a lightweight cross-modal information mutual reinforcement network for RGB-T SOD. The network consists of a lightweight encoder, a cross-modal information mutual reinforcement (CMIMR) module, and a semantic-information-guided fusion (SIGF) module. To reduce the computational cost and the number of parameters, we employ lightweight modules in both the encoder and the decoder. To fuse the complementary information between the two modalities, we design the CMIMR module, which refines the two-modal features by absorbing previous-level semantic information and inter-modal complementary information. To fuse cross-level features and detect multiscale salient objects, we design the SIGF module, which suppresses background noise in low-level features and extracts multiscale information. We conduct extensive experiments on three RGB-T datasets, and our method achieves competitive performance compared with 15 state-of-the-art methods.
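The abstract does not say which lightweight module the authors use. As a rough illustration of why such modules cut the parameter count, the sketch below compares a standard convolution with a depthwise separable convolution, one common lightweight building block. Treating it as the block used here is an assumption, and the layer sizes are hypothetical values chosen only for the arithmetic.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 64, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
reduction = standard / separable  # how many times fewer weights the separable layer has

print(f"standard: {standard}, separable: {separable}, reduction: {reduction:.2f}x")
```

For these sizes the separable layer needs 4,672 weights against 36,864 for the standard one, which is the kind of saving that makes a model more practical on mobile hardware.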
