TCAINet: an RGB-T salient object detection model with cross-modal fusion and adaptive decoding

Author Information

Peng Hong, Hu Yunfei, Yu Baocai, Zhang Zhen

Affiliations

Ordos Institute of Liaoning Technical University, Ordos, China.

Faculty of Electronic and Information Engineering, Liaoning Technical University, Huludao 125100, Liaoning, China.

Publication Information

Sci Rep. 2025 Apr 24;15(1):14266. doi: 10.1038/s41598-025-98423-z.

Abstract

In the field of deep learning-based object detection, RGB-T salient object detection (SOD) networks show significant potential for cross-modal information fusion. However, existing methods still face considerable challenges in complex scenes. Specifically, current cross-modal feature fusion approaches fail to fully exploit the complementary information between modalities, resulting in limited robustness when handling diverse inputs. Furthermore, inadequate adaptation to multi-scale features hinders accurate recognition of salient objects at different scales. Although some feature decoding strategies attempt to mitigate noise interference, they often struggle in high-noise environments and lack flexible feature weighting, further restricting fusion capabilities. To address these limitations, this paper proposes a novel salient object detection network, TCAINet. The network integrates a Channel Attention (CA) mechanism, an enhanced cross-modal fusion module (CAF), and an adaptive decoder (AAD) to improve both the depth and breadth of feature fusion. Additionally, diverse noise-addition and augmentation methods are applied during data preprocessing to boost the model's robustness and adaptability. Specifically, the CA module enhances the model's feature selection ability, while the CAF and AAD modules optimize the integration and processing of multimodal information. Experimental results demonstrate that TCAINet outperforms existing methods across multiple evaluation metrics, proving its effectiveness and practicality in complex scenes. Notably, the proposed model achieves improvements of 0.653%, 1.384%, 1.019%, and 5.83% in the Sm, Em, Fm, and MAE metrics, respectively, confirming its efficacy in enhancing detection accuracy and optimizing feature fusion. The code and results can be found at the following link: huyunfei0219/TCAINet.
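The exact CA, CAF, and AAD designs are specified in the paper and the linked repository; as a rough illustration only, the sketch below shows one common way a channel-attention-gated cross-modal fusion step can be written in PyTorch, using SE-style squeeze-and-excitation gating over concatenated RGB and thermal features. The class names, reduction ratio, and concatenate-then-project layout are assumptions for illustration, not TCAINet's actual modules.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention: squeeze spatial dims, excite per channel.
    (Assumed design for illustration; TCAINet's CA module may differ.)"""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pool -> (B, C, 1, 1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),  # per-channel weights in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(self.pool(x))  # reweight channels by learned importance

class CrossModalFusion(nn.Module):
    """Hypothetical fusion step: concatenate RGB and thermal features along the
    channel axis, gate with channel attention, then project back."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(2 * channels)
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([rgb, thermal], dim=1)  # stack the two modalities
        return self.proj(self.ca(fused))          # attention-gated fusion

# Usage: fuse one feature level of a two-stream backbone.
rgb_feat = torch.randn(2, 64, 56, 56)      # (batch, channels, H, W)
thermal_feat = torch.randn(2, 64, 56, 56)
out = CrossModalFusion(64)(rgb_feat, thermal_feat)
print(out.shape)  # torch.Size([2, 64, 56, 56])
```

The channel gating lets the network suppress the less informative modality per channel (e.g., thermal features in well-lit scenes, RGB features in darkness), which is the complementary-information behavior the abstract describes.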


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac80/12022040/a1df0282b78e/41598_2025_98423_Fig1_HTML.jpg
