

Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation

Authors

Zhang Pan, Chen Ming, Gao Meng

Affiliation

College of Information, Shanghai Ocean University, No. 999 Hucheng Ring Road, Shanghai 201306, China.

Publication

Sensors (Basel). 2024 Apr 12;24(8):2473. doi: 10.3390/s24082473.

PMID: 38676090
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11053727/
Abstract

Leveraging data from various modalities to enhance multimodal segmentation tasks is a well-regarded approach. Recently, efforts have been made to incorporate an array of modalities, including depth and thermal imaging. Nevertheless, the effective amalgamation of cross-modal interactions remains a challenge, given the unique traits each modality presents. In our current research, we introduce the semantic guidance fusion network (SGFN), which is an innovative cross-modal fusion network adept at integrating a diverse set of modalities. Particularly, the SGFN features a semantic guidance module (SGM) engineered to boost bi-modal feature extraction. It encompasses a learnable semantic guidance convolution (SGC) designed to merge intensity and gradient data from disparate modalities. Comprehensive experiments carried out on the NYU Depth V2, SUN-RGBD, Cityscapes, MFNet, and ZJU datasets underscore both the superior performance and generalization ability of the SGFN compared to the current leading models. Moreover, when tested on the DELIVER dataset, the efficiency of our bi-modal SGFN displayed a mIoU that is comparable to the hitherto leading model, CMNEXT.
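The abstract describes the semantic guidance convolution (SGC) only at a high level — a learnable operator that merges intensity and gradient data. The paper's actual formulation is not given here, so the following is a purely illustrative sketch: a single blending parameter `theta` interpolates between a 3x3 box (intensity) response and a central-difference (gradient) response, one plausible style of operator behind such a design. The function name `semantic_guidance_fuse` and the parameter `theta` are invented for this sketch.

```python
import numpy as np

def semantic_guidance_fuse(feat, theta=0.5):
    """Illustrative only: blend a 3x3 box (intensity) response of `feat`
    with its central-difference (gradient) response.
    theta = 0 -> pure intensity aggregation; theta = 1 -> pure gradient
    contrast (a central-difference convolution with a box kernel)."""
    h, w = feat.shape
    pad = np.pad(feat, 1, mode="edge")
    # 3x3 box filter: sum of the nine shifted windows
    intensity = sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3))
    # central-difference response: box aggregation minus 9x the centre pixel
    gradient = intensity - 9.0 * feat
    return (1.0 - theta) * intensity + theta * gradient

# On a flat region the gradient term vanishes; on edges it dominates
# as theta -> 1, so theta trades smoothing against edge sensitivity.
flat = np.ones((4, 4))
print(semantic_guidance_fuse(flat, theta=1.0))  # all zeros: no gradient signal
```

In a trainable network this scalar would typically be a learned parameter (and the box kernel a learned weight tensor), which is the general shape of "learnable" fusion the abstract hints at.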


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66c5/11053727/d06d8b36081a/sensors-24-02473-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66c5/11053727/1bf7bfc7ec9c/sensors-24-02473-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66c5/11053727/b3a7d686683b/sensors-24-02473-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66c5/11053727/963d09c87109/sensors-24-02473-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66c5/11053727/31aaba8e5375/sensors-24-02473-g006.jpg

Similar Articles

1
Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation.
Sensors (Basel). 2024 Apr 12;24(8):2473. doi: 10.3390/s24082473.
2
SwinCross: Cross-modal Swin transformer for head-and-neck tumor segmentation in PET/CT images.
Med Phys. 2024 Mar;51(3):2096-2107. doi: 10.1002/mp.16703. Epub 2023 Sep 30.
3
Simple Scalable Multimodal Semantic Segmentation Model.
Sensors (Basel). 2024 Jan 22;24(2):699. doi: 10.3390/s24020699.
4
FGCN: Image-Fused Point Cloud Semantic Segmentation with Fusion Graph Convolutional Network.
Sensors (Basel). 2023 Oct 9;23(19):8338. doi: 10.3390/s23198338.
5
A modality-collaborative convolution and transformer hybrid network for unpaired multi-modal medical image segmentation with limited annotations.
Med Phys. 2023 Sep;50(9):5460-5478. doi: 10.1002/mp.16338. Epub 2023 Mar 15.
6
Rethinking 1D convolution for lightweight semantic segmentation.
Front Neurorobot. 2023 Feb 9;17:1119231. doi: 10.3389/fnbot.2023.1119231. eCollection 2023.
7
Mitigating Modality Discrepancies for RGB-T Semantic Segmentation.
IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9380-9394. doi: 10.1109/TNNLS.2022.3233089. Epub 2024 Jul 8.
8
Self-Enhanced Mixed Attention Network for Three-Modal Images Few-Shot Semantic Segmentation.
Sensors (Basel). 2023 Jul 22;23(14):6612. doi: 10.3390/s23146612.
9
GMNet: Graded-Feature Multilabel-Learning Network for RGB-Thermal Urban Scene Semantic Segmentation.
IEEE Trans Image Process. 2021;30:7790-7802. doi: 10.1109/TIP.2021.3109518. Epub 2021 Sep 14.
10
Modality preserving U-Net for segmentation of multimodal medical images.
Quant Imaging Med Surg. 2023 Aug 1;13(8):5242-5257. doi: 10.21037/qims-22-1367. Epub 2023 Jun 14.
