WaveNet: Wavelet Network With Knowledge Distillation for RGB-T Salient Object Detection.

Publication information

IEEE Trans Image Process. 2023;32:3027-3039. doi: 10.1109/TIP.2023.3275538. Epub 2023 May 26.

DOI:10.1109/TIP.2023.3275538

PMID:37192028

Abstract

In recent years, various neural network architectures for computer vision have been devised, such as the visual transformer and multilayer perceptron (MLP). A transformer based on an attention mechanism can outperform a traditional convolutional neural network. Compared with the convolutional neural network and transformer, the MLP introduces less inductive bias and achieves stronger generalization. In addition, a transformer shows an exponential increase in the inference, training, and debugging times. Considering a wave function representation, we propose the WaveNet architecture that adopts a novel vision task-oriented wavelet-based MLP for feature extraction to perform salient object detection in RGB (red-green-blue)-thermal infrared images. In addition, we apply knowledge distillation to a transformer as an advanced teacher network to acquire rich semantic and geometric information and guide WaveNet learning with this information. Following the shortest-path concept, we adopt the Kullback-Leibler distance as a regularization term for the RGB features to be as similar to the thermal infrared features as possible. The discrete wavelet transform allows for the examination of frequency-domain features in a local time domain and time-domain features in a local frequency domain. We apply this representation ability to perform cross-modality feature fusion. Specifically, we introduce a progressively cascaded sine-cosine module for cross-layer feature fusion and use low-level features to obtain clear boundaries of salient objects through the MLP. Results from extensive experiments indicate that the proposed WaveNet achieves impressive performance on benchmark RGB-thermal infrared datasets. The results and code are publicly available at https://github.com/nowander/WaveNet.
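Two ingredients named in the abstract, the Kullback-Leibler regularizer between modalities and the discrete wavelet transform, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the softmax normalization of features, the KL direction (thermal as the reference distribution), and the choice of a single-level 1D Haar wavelet are all assumptions made here for illustration.

```python
import math


def softmax(values):
    """Turn a flat list of feature activations into a probability distribution."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # eps guards against log(0) and division by zero
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def modality_kl_penalty(rgb_features, thermal_features):
    """Hypothetical regularizer: KL(thermal || rgb) on softmax-normalized features.

    Assumption: the thermal features act as the reference distribution that
    the RGB features are pulled toward. The abstract does not state the direction.
    """
    return kl_divergence(softmax(thermal_features), softmax(rgb_features))


def haar_dwt_1d(signal):
    """Single-level orthonormal Haar DWT; returns (approximation, detail).

    The approximation holds scaled pairwise sums (low-frequency content),
    the detail holds scaled pairwise differences (high-frequency content),
    and each coefficient stays tied to one position pair in the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


if __name__ == "__main__":
    rgb = [0.5, 1.0, -2.0, 0.3]       # toy flattened RGB feature vector
    thermal = [0.4, 1.2, -1.5, 0.1]   # toy flattened thermal feature vector
    print("KL penalty:", modality_kl_penalty(rgb, thermal))
    print("Haar DWT:", haar_dwt_1d([1.0, 1.0, 2.0, 2.0]))
```

The penalty is zero when the two feature vectors are identical and grows as their normalized distributions diverge, which is the behavior a regularizer of this kind relies on. In the Haar example, a piecewise-constant input produces zero detail coefficients, showing how the transform separates smooth content from local changes while keeping positional information.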


Similar articles

WaveNet: Wavelet Network With Knowledge Distillation for RGB-T Salient Object Detection.

IEEE Trans Image Process. 2023;32:3027-3039. doi: 10.1109/TIP.2023.3275538. Epub 2023 May 26.

6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning.

IEEE Trans Image Process. 2022;31:6907-6921. doi: 10.1109/TIP.2022.3216980. Epub 2022 Nov 3.

Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection.

Sensors (Basel). 2023 Oct 29;23(21):8802. doi: 10.3390/s23218802.

ASIF-Net: Attention Steered Interweave Fusion Network for RGB-D Salient Object Detection.

IEEE Trans Cybern. 2021 Jan;51(1):88-100. doi: 10.1109/TCYB.2020.2969255. Epub 2020 Dec 22.

3-D Convolutional Neural Networks for RGB-D Salient Object Detection and Beyond.

IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4309-4323. doi: 10.1109/TNNLS.2022.3202241. Epub 2024 Feb 29.

RGB-D salient object detection via convolutional capsule network based on feature extraction and integration.

Sci Rep. 2023 Oct 17;13(1):17652. doi: 10.1038/s41598-023-44698-z.

RGB-T Salient Object Detection via Fusing Multi-level CNN Features.

IEEE Trans Image Process. 2019 Dec 17. doi: 10.1109/TIP.2019.2959253.

Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection.

IEEE Trans Image Process. 2021;30:3528-3542. doi: 10.1109/TIP.2021.3062689. Epub 2021 Mar 11.

LSNet: Lightweight Spatial Boosting Network for Detecting Salient Objects in RGB-Thermal Images.

IEEE Trans Image Process. 2023;32:1329-1340. doi: 10.1109/TIP.2023.3242775. Epub 2023 Feb 27.

MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection.

Neural Netw. 2024 Mar;171:410-422. doi: 10.1016/j.neunet.2023.12.031. Epub 2023 Dec 19.

Cited by

SDA-Net: A Spatially Optimized Dual-Stream Network with Adaptive Global Attention for Building Extraction in Multi-Modal Remote Sensing Images.

Sensors (Basel). 2025 Mar 27;25(7):2112. doi: 10.3390/s25072112.

Focusing on Cracks with Instance Normalization Wavelet Layer.

Sensors (Basel). 2024 Dec 29;25(1):146. doi: 10.3390/s25010146.

Wavelet-Driven Multi-Band Feature Fusion for RGB-T Salient Object Detection.

Sensors (Basel). 2024 Dec 20;24(24):8159. doi: 10.3390/s24248159.

Edge-guided feature fusion network for RGB-T salient object detection.

Front Neurorobot. 2024 Dec 17;18:1489658. doi: 10.3389/fnbot.2024.1489658. eCollection 2024.
