

Quality-Aware Selective Fusion Network for V-D-T Salient Object Detection.

Authors

Bao Liuxin, Zhou Xiaofei, Lu Xiankai, Sun Yaoqi, Yin Haibing, Hu Zhenghui, Zhang Jiyong, Yan Chenggang

Publication

IEEE Trans Image Process. 2024;33:3212-3226. doi: 10.1109/TIP.2024.3393365. Epub 2024 May 6.

DOI: 10.1109/TIP.2024.3393365
PMID: 38687650
Abstract

Depth images and thermal images carry spatial geometry and surface temperature information, respectively, which can complement the RGB modality. However, the quality of depth and thermal images is often unreliable in challenging scenarios, which degrades the performance of two-modal salient object detection (SOD). Some researchers have therefore turned to the triple-modal SOD task, namely visible-depth-thermal (VDT) SOD, which seeks to exploit the complementarity of the RGB, depth, and thermal images. However, existing triple-modal SOD methods fail to perceive the quality of the depth maps and thermal images, leading to performance degradation in scenes with low-quality depth and thermal inputs. Therefore, in this paper, we propose a quality-aware selective fusion network (QSF-Net) for VDT salient object detection, which comprises three subnets: an initial feature extraction subnet, a quality-aware region selection subnet, and a region-guided selective fusion subnet. First, besides extracting features, the initial feature extraction subnet generates a preliminary prediction map for each modality via a shrinkage pyramid architecture equipped with the multi-scale fusion (MSF) module. Then, we design a weakly-supervised quality-aware region selection subnet to generate quality-aware maps. Concretely, we first identify high-quality and low-quality regions from the preliminary predictions; these regions constitute the pseudo labels used to train this subnet. Finally, the region-guided selective fusion subnet purifies the initial features under the guidance of the quality-aware maps, then fuses the triple-modal features and refines the edge details of the prediction maps through the intra-modality and inter-modality attention (IIA) module and the edge refinement (ER) module, respectively.
Extensive experiments on the VDT-2048 dataset show that our saliency model consistently outperforms 13 state-of-the-art methods by a large margin. Our code and results are available at https://github.com/Lx-Bao/QSFNet.
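The core idea of region-guided selective fusion — down-weighting each modality's features wherever its quality-aware map flags the input as unreliable — can be illustrated with a minimal NumPy sketch. This is an assumption-labeled illustration only: the function name, the normalized pixel-wise weighting, and the uniform fallback are not the paper's formulation, which uses learned attention modules (IIA) rather than a fixed weighted average.

```python
import numpy as np

def quality_aware_fusion(features, quality_maps, eps=1e-8):
    """Fuse per-modality feature maps, weighting each pixel by that
    modality's quality-aware map (hypothetical sketch, not QSF-Net's
    learned fusion).

    features:     list of (C, H, W) arrays, e.g. [RGB, depth, thermal]
    quality_maps: list of (H, W) arrays in [0, 1], one per modality
    """
    # Weight each modality's features pixel-wise by its quality map.
    weighted = [q[None, :, :] * f for f, q in zip(features, quality_maps)]
    # Normalize by the total quality so reliable modalities dominate;
    # eps guards against pixels where every modality is rated zero.
    total_quality = np.sum(quality_maps, axis=0)[None, :, :] + eps
    return np.sum(weighted, axis=0) / total_quality
```

For example, a pixel where the depth map is rated 0.0 and RGB is rated 1.0 takes its fused value entirely from the RGB features, which mirrors the paper's motivation: low-quality depth or thermal regions should not corrupt the fused representation.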


Similar Articles

1
Quality-Aware Selective Fusion Network for V-D-T Salient Object Detection.
IEEE Trans Image Process. 2024;33:3212-3226. doi: 10.1109/TIP.2024.3393365. Epub 2024 May 6.
2
Dynamic Selective Network for RGB-D Salient Object Detection.
IEEE Trans Image Process. 2021;30:9179-9192. doi: 10.1109/TIP.2021.3123548. Epub 2021 Nov 10.
3
MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection.
Neural Netw. 2024 Mar;171:410-422. doi: 10.1016/j.neunet.2023.12.031. Epub 2023 Dec 19.
4
CDNet: Complementary Depth Network for RGB-D Salient Object Detection.
IEEE Trans Image Process. 2021;30:3376-3390. doi: 10.1109/TIP.2021.3060167. Epub 2021 Mar 9.
5
Global Guided Cross-Modal Cross-Scale Network for RGB-D Salient Object Detection.
Sensors (Basel). 2023 Aug 17;23(16):7221. doi: 10.3390/s23167221.
6
SLMSF-Net: A Semantic Localization and Multi-Scale Fusion Network for RGB-D Salient Object Detection.
Sensors (Basel). 2024 Feb 8;24(4):1117. doi: 10.3390/s24041117.
7
EM-Trans: Edge-Aware Multimodal Transformer for RGB-D Salient Object Detection.
IEEE Trans Neural Netw Learn Syst. 2025 Feb;36(2):3175-3188. doi: 10.1109/TNNLS.2024.3358858. Epub 2025 Feb 6.
8
Depth-Quality-Aware Salient Object Detection.
IEEE Trans Image Process. 2021;30:2350-2363. doi: 10.1109/TIP.2021.3052069. Epub 2021 Jan 27.
9
Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection.
Sensors (Basel). 2023 Oct 29;23(21):8802. doi: 10.3390/s23218802.
10
ASIF-Net: Attention Steered Interweave Fusion Network for RGB-D Salient Object Detection.
IEEE Trans Cybern. 2021 Jan;51(1):88-100. doi: 10.1109/TCYB.2020.2969255. Epub 2020 Dec 22.