
A Salient Object Detection Network Enhanced by Nonlinear Spiking Neural Systems and Transformer.

Authors

Li Wang, Xia Meichen, Peng Hong, Liu Zhicai, Guo Jun

Affiliations

School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China.

West China Hospital, Sichuan University, Chengdu 610064, P. R. China.

Publication Information

Int J Neural Syst. 2025 Jun 20:2550045. doi: 10.1142/S0129065725500455.

Abstract

Although a variety of deep learning-based methods have been introduced for Salient Object Detection (SOD) on RGB and Depth (RGB-D) images, existing approaches still face challenges, including inadequate cross-modal feature fusion, significant saliency-estimation errors caused by noise in the depth information, and limited model generalization. To tackle these challenges, this paper introduces an innovative RGB-D SOD method, TranSNP-Net, which integrates Nonlinear Spiking Neural P (NSNP) systems with Transformer networks. TranSNP-Net effectively fuses RGB and depth features through an enhanced feature fusion module (SNPFusion) and an attention mechanism. Unlike traditional methods, TranSNP-Net uses a fine-tuned Swin (shifted-window) Transformer as its backbone network, significantly improving the model's generalization performance. Furthermore, the proposed hierarchical feature decoder (SNP-D) notably enhances accuracy in complex scenes where depth noise is prevalent. In the experiments, the mean scores for the four metrics S-measure, F-measure, E-measure, and MAE across the six RGB-D benchmark datasets are 0.9328, 0.9356, 0.9558, and 0.0288, respectively, and TranSNP-Net outperforms 14 leading methods on these datasets.
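
For reference, the short Python/NumPy sketch below illustrates how MAE and the adaptive-threshold F-measure are conventionally defined for saliency maps in the RGB-D SOD literature; it is an illustrative assumption, not the authors' evaluation code, and the function names mae and f_measure are hypothetical.

import numpy as np

def mae(pred, gt):
    # Mean Absolute Error between a predicted saliency map and the binary
    # ground-truth mask, both scaled to [0, 1] (lower is better).
    return float(np.abs(pred - gt).mean())

def f_measure(pred, gt, beta2=0.3):
    # F-measure with the adaptive threshold (twice the mean saliency) and
    # beta^2 = 0.3, the common convention in SOD benchmarks (higher is better).
    thresh = min(2.0 * pred.mean(), 1.0)
    binary = pred >= thresh
    tp = np.logical_and(binary, gt > 0.5).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / ((gt > 0.5).sum() + 1e-8)
    return float((1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8))

# Toy check: a perfect prediction yields MAE = 0 and F-measure close to 1.
gt = np.zeros((64, 64)); gt[16:48, 16:48] = 1.0
print(mae(gt, gt), f_measure(gt, gt))

S-measure and E-measure additionally assess structural and region/pixel-level alignment and are typically computed with the standard SOD benchmark toolkits.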
