

CSANet: Context-Spatial Awareness Network for RGB-T Urban Scene Understanding.

Author Information

Li Ruixiang, Wang Zhen, Guo Jianxin, Zhang Chuanlei

Affiliations

School of Electronic Information, Xijing University, Xijing Road, Chang'an District, Xi'an 710123, China.

School of Computer Science, Northwestern Polytechnical University, Dongxiang Road, Chang'an District, Xi'an 710129, China.

Publication Information

J Imaging. 2025 Jun 9;11(6):188. doi: 10.3390/jimaging11060188.

DOI: 10.3390/jimaging11060188
PMID: 40558787
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12194480/
Abstract

Semantic segmentation plays a critical role in understanding complex urban environments, particularly for autonomous driving applications. However, existing approaches face significant challenges under low-light and adverse weather conditions. To address these limitations, we propose CSANet (Context Spatial Awareness Network), a novel framework that effectively integrates RGB and thermal infrared (TIR) modalities. CSANet employs an efficient encoder to extract complementary local and global features, while a hierarchical fusion strategy is adopted to selectively integrate visual and semantic information. Notably, the Channel-Spatial Cross-Fusion Module (CSCFM) enhances local details by fusing multi-modal features, and the Multi-Head Fusion Module (MHFM) captures global dependencies and calibrates multi-modal information. Furthermore, the Spatial Coordinate Attention Mechanism (SCAM) improves object localization accuracy in complex urban scenes. Evaluations on benchmark datasets (MFNet and PST900) demonstrate that CSANet achieves state-of-the-art performance, significantly advancing RGB-T semantic segmentation.
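The fusion idea described in the abstract — re-weighting each modality's feature map with channel and spatial attention derived from the other modality before combining them — can be sketched as follows. This is a minimal NumPy illustration under assumed shapes and gating functions, not the authors' CSCFM implementation; all function names (`channel_attention`, `spatial_attention`, `cross_fuse`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # Global average pool over spatial dims -> one gate per channel.
    return sigmoid(feat.mean(axis=(1, 2), keepdims=True))   # (C, 1, 1)

def spatial_attention(feat):
    # Mean over channels -> one gate per spatial location.
    return sigmoid(feat.mean(axis=0, keepdims=True))        # (1, H, W)

def cross_fuse(rgb, tir):
    """Toy channel-spatial cross-fusion: each modality is re-weighted
    by attention computed from the *other* modality, then summed."""
    rgb_gated = rgb * channel_attention(tir) * spatial_attention(tir)
    tir_gated = tir * channel_attention(rgb) * spatial_attention(rgb)
    return rgb_gated + tir_gated

rng = np.random.default_rng(0)
rgb = rng.standard_normal((64, 32, 32))   # (channels, H, W) RGB features
tir = rng.standard_normal((64, 32, 32))   # thermal (TIR) features
out = cross_fuse(rgb, tir)
print(out.shape)                          # (64, 32, 32)
```

The cross-wiring (RGB gated by TIR statistics and vice versa) is what lets one modality compensate when the other degrades, e.g. RGB at night or TIR on thermally uniform surfaces.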


Figures (PMC12194480):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/d15121b936ed/jimaging-11-00188-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/2b5e8fe78d69/jimaging-11-00188-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/7da0cfa9f8f8/jimaging-11-00188-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/eafa82351074/jimaging-11-00188-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/6f791fd7fe06/jimaging-11-00188-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/8c9cc42ded47/jimaging-11-00188-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/c6cf8fb717c7/jimaging-11-00188-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/593851892f49/jimaging-11-00188-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3873/12194480/cd25f721407e/jimaging-11-00188-g009.jpg

Similar Articles

1. CSANet: Context-Spatial Awareness Network for RGB-T Urban Scene Understanding.
J Imaging. 2025 Jun 9;11(6):188. doi: 10.3390/jimaging11060188.
2. CFANet: The Cross-Modal Fusion Attention Network for Indoor RGB-D Semantic Segmentation.
J Imaging. 2025 May 27;11(6):177. doi: 10.3390/jimaging11060177.
3. DGCFNet: Dual Global Context Fusion Network for remote sensing image semantic segmentation.
PeerJ Comput Sci. 2025 Mar 27;11:e2786. doi: 10.7717/peerj-cs.2786. eCollection 2025.
4. TLTNet: A novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation.
Comput Biol Med. 2024 Aug;178:108773. doi: 10.1016/j.compbiomed.2024.108773. Epub 2024 Jun 25.
5. Liver Semantic Segmentation Method Based on Multi-Channel Feature Extraction and Cross Fusion.
Bioengineering (Basel). 2025 Jun 11;12(6):636. doi: 10.3390/bioengineering12060636.
6. MACCoM: A multiple attention and convolutional cross-mixer framework for detailed 2D biomedical image segmentation.
Comput Biol Med. 2024 Sep;179:108847. doi: 10.1016/j.compbiomed.2024.108847. Epub 2024 Jul 15.
7. Nonlinear Spiking Neural Systems for thermal Image Semantic Segmentation Networks.
Int J Neural Syst. 2025 May 19:2550038. doi: 10.1142/S0129065725500388.
8. Mitigating Modality Discrepancies for RGB-T Semantic Segmentation.
IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9380-9394. doi: 10.1109/TNNLS.2022.3233089. Epub 2024 Jul 8.
9. A Salient Object Detection Network Enhanced by Nonlinear Spiking Neural Systems and Transformer.
Int J Neural Syst. 2025 Jun 20:2550045. doi: 10.1142/S0129065725500455.
10. SODU2-NET: a novel deep learning-based approach for salient object detection utilizing U-NET.
PeerJ Comput Sci. 2025 May 19;11:e2623. doi: 10.7717/peerj-cs.2623. eCollection 2025.

References Cited in This Article

1. GMNet: Graded-Feature Multilabel-Learning Network for RGB-Thermal Urban Scene Semantic Segmentation.
IEEE Trans Image Process. 2021;30:7790-7802. doi: 10.1109/TIP.2021.3109518. Epub 2021 Sep 14.
2. Siamese Network for RGB-D Salient Object Detection and Beyond.
IEEE Trans Pattern Anal Mach Intell. 2021 Apr 16;PP. doi: 10.1109/TPAMI.2021.3073689.
3. UNet++: A Nested U-Net Architecture for Medical Image Segmentation.
Deep Learn Med Image Anal Multimodal Learn Clin Decis Support (2018). 2018 Sep;11045:3-11. doi: 10.1007/978-3-030-00889-5_1. Epub 2018 Sep 20.
4. Recurrent residual U-Net for medical image segmentation.
J Med Imaging (Bellingham). 2019 Jan;6(1):014006. doi: 10.1117/1.JMI.6.1.014006. Epub 2019 Mar 27.
5. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):834-848. doi: 10.1109/TPAMI.2017.2699184. Epub 2017 Apr 27.
6. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2481-2495. doi: 10.1109/TPAMI.2016.2644615. Epub 2017 Jan 2.