
Samba: Semantic segmentation of remotely sensed images with state space model.

Authors

Zhu Qinfeng, Cai Yuanzhi, Fang Yuan, Yang Yihan, Chen Cheng, Fan Lei, Nguyen Anh

Affiliations

Department of Civil Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China.

Department of Computer Science, University of Liverpool, Liverpool, L69 3BX, UK.

Publication

Heliyon. 2024 Sep 26;10(19):e38495. doi: 10.1016/j.heliyon.2024.e38495. eCollection 2024 Oct 15.

Abstract

High-resolution remotely sensed images pose challenges to traditional semantic segmentation networks, such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). CNN-based methods struggle to handle high-resolution images due to their limited receptive field, while ViT-based methods, despite having a global receptive field, face challenges when processing long sequences. Inspired by the Mamba network, which is based on a state space model (SSM) to efficiently capture global semantic information, we propose a semantic segmentation framework for high-resolution remotely sensed imagery, named Samba. Samba utilizes an encoder-decoder architecture, with multiple Samba blocks serving as the encoder to efficiently extract multi-level semantic information, and UperNet functioning as the decoder. We evaluate Samba on the LoveDA, ISPRS Vaihingen, and ISPRS Potsdam datasets using the mIoU and mF1 metrics, and compare it with top-performing CNN-based and ViT-based methods. The results demonstrate that Samba achieves unparalleled performance on commonly used remotely sensed datasets for semantic segmentation. Samba is the first to demonstrate the effectiveness of SSM in segmenting remotely sensed imagery, setting a new performance benchmark for Mamba-based techniques in remote sensing semantic segmentation. The source code and baseline implementations are available at https://github.com/zhuqinfeng1999/Samba.
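To make the SSM idea concrete: a state space model processes a sequence through a linear hidden-state recurrence, which is what lets Mamba-style blocks cover long token sequences in linear time rather than the quadratic cost of self-attention. The following is a minimal illustrative sketch of the discrete SSM recurrence (h[t] = A·h[t-1] + B·x[t], y[t] = C·h[t]) with fixed matrices; it is not the Samba block itself, which uses Mamba's input-dependent (selective) parameters and a hardware-efficient scan. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run a discrete linear state space recurrence over a sequence.

    h[t] = A @ h[t-1] + B @ x[t]
    y[t] = C @ h[t]

    x: (T, d_in) input sequence; A: (d_state, d_state);
    B: (d_state, d_in); C: (d_out, d_state). Returns (T, d_out).
    """
    T = x.shape[0]
    h = np.zeros(A.shape[0])  # hidden state starts at zero
    ys = []
    for t in range(T):
        h = A @ h + B @ x[t]  # state update: linear in sequence length
        ys.append(C @ h)      # readout at each step
    return np.stack(ys)

# Toy example: constant 1-D input, 2-D hidden state, scalar output.
x = np.ones((4, 1))
A = 0.5 * np.eye(2)   # decaying memory of past inputs
B = np.ones((2, 1))
C = np.ones((1, 2))
y = ssm_scan(x, A, B, C)
# Each output accumulates a geometrically decaying sum of past inputs,
# so y converges toward 4.0 here: 2.0, 3.0, 3.5, 3.75, ...
```

Because the recurrence touches each token once, the cost is O(T) in sequence length, which is the property the abstract contrasts with ViTs on long sequences.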

