WaveNet: Wavelet Network With Knowledge Distillation for RGB-T Salient Object Detection.

Publication information

IEEE Trans Image Process. 2023;32:3027-3039. doi: 10.1109/TIP.2023.3275538. Epub 2023 May 26.

DOI: 10.1109/TIP.2023.3275538
PMID: 37192028
Abstract

In recent years, various neural network architectures for computer vision have been devised, such as the visual transformer and multilayer perceptron (MLP). A transformer based on an attention mechanism can outperform a traditional convolutional neural network. Compared with the convolutional neural network and transformer, the MLP introduces less inductive bias and achieves stronger generalization. In addition, a transformer shows an exponential increase in the inference, training, and debugging times. Considering a wave function representation, we propose the WaveNet architecture that adopts a novel vision task-oriented wavelet-based MLP for feature extraction to perform salient object detection in RGB (red-green-blue)-thermal infrared images. In addition, we apply knowledge distillation to a transformer as an advanced teacher network to acquire rich semantic and geometric information and guide WaveNet learning with this information. Following the shortest-path concept, we adopt the Kullback-Leibler distance as a regularization term for the RGB features to be as similar to the thermal infrared features as possible. The discrete wavelet transform allows for the examination of frequency-domain features in a local time domain and time-domain features in a local frequency domain. We apply this representation ability to perform cross-modality feature fusion. Specifically, we introduce a progressively cascaded sine-cosine module for cross-layer feature fusion and use low-level features to obtain clear boundaries of salient objects through the MLP. Results from extensive experiments indicate that the proposed WaveNet achieves impressive performance on benchmark RGB-thermal infrared datasets. The results and code are publicly available at https://github.com/nowander/WaveNet.
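
The abstract describes using the Kullback-Leibler distance as a regularization term that pulls the RGB features toward the thermal infrared features. The sketch below illustrates that idea in plain Python. It is not the authors' implementation (see the linked repository for that): the softmax normalization, the direction KL(thermal || RGB), and the function names `softmax`, `kl_divergence`, and `kl_regularizer` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Turn raw feature activations into a probability distribution."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def kl_regularizer(rgb_feat: Sequence[float], thermal_feat: Sequence[float]) -> float:
    """Hypothetical regularization term (an assumption, not the paper's code).

    Treats the thermal features as the target distribution and penalizes
    the RGB features for deviating from it.
    """
    return kl_divergence(softmax(thermal_feat), softmax(rgb_feat))


if __name__ == "__main__":
    rgb = [0.2, 1.5, -0.3, 0.9]      # made-up flattened RGB features
    thermal = [0.1, 1.2, 0.0, 1.1]   # made-up flattened thermal features
    print(kl_regularizer(rgb, thermal))
```

In a training loop, a term like this would typically be added to the main detection loss with a small weight, so that minimizing the total loss also reduces the mismatch between the two modalities.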

Similar articles

1
WaveNet: Wavelet Network With Knowledge Distillation for RGB-T Salient Object Detection.
IEEE Trans Image Process. 2023;32:3027-3039. doi: 10.1109/TIP.2023.3275538. Epub 2023 May 26.
2
6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning.
IEEE Trans Image Process. 2022;31:6907-6921. doi: 10.1109/TIP.2022.3216980. Epub 2022 Nov 3.
3
Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection.
Sensors (Basel). 2023 Oct 29;23(21):8802. doi: 10.3390/s23218802.
4
ASIF-Net: Attention Steered Interweave Fusion Network for RGB-D Salient Object Detection.
IEEE Trans Cybern. 2021 Jan;51(1):88-100. doi: 10.1109/TCYB.2020.2969255. Epub 2020 Dec 22.
5
3-D Convolutional Neural Networks for RGB-D Salient Object Detection and Beyond.
IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4309-4323. doi: 10.1109/TNNLS.2022.3202241. Epub 2024 Feb 29.
6
RGB-D salient object detection via convolutional capsule network based on feature extraction and integration.
Sci Rep. 2023 Oct 17;13(1):17652. doi: 10.1038/s41598-023-44698-z.
7
RGB-T Salient Object Detection via Fusing Multi-level CNN Features.
IEEE Trans Image Process. 2019 Dec 17. doi: 10.1109/TIP.2019.2959253.
8
Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection.
IEEE Trans Image Process. 2021;30:3528-3542. doi: 10.1109/TIP.2021.3062689. Epub 2021 Mar 11.
9
LSNet: Lightweight Spatial Boosting Network for Detecting Salient Objects in RGB-Thermal Images.
IEEE Trans Image Process. 2023;32:1329-1340. doi: 10.1109/TIP.2023.3242775. Epub 2023 Feb 27.
10
MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection.
Neural Netw. 2024 Mar;171:410-422. doi: 10.1016/j.neunet.2023.12.031. Epub 2023 Dec 19.

Cited by

1
SDA-Net: A Spatially Optimized Dual-Stream Network with Adaptive Global Attention for Building Extraction in Multi-Modal Remote Sensing Images.
Sensors (Basel). 2025 Mar 27;25(7):2112. doi: 10.3390/s25072112.
2
Focusing on Cracks with Instance Normalization Wavelet Layer.
Sensors (Basel). 2024 Dec 29;25(1):146. doi: 10.3390/s25010146.
3
Wavelet-Driven Multi-Band Feature Fusion for RGB-T Salient Object Detection.
Sensors (Basel). 2024 Dec 20;24(24):8159. doi: 10.3390/s24248159.
4
Edge-guided feature fusion network for RGB-T salient object detection.
Front Neurorobot. 2024 Dec 17;18:1489658. doi: 10.3389/fnbot.2024.1489658. eCollection 2024.