

Multi-Scale Guided Context-Aware Transformer for Remote Sensing Building Extraction.

Authors

Yu Mengxuan, Li Jiepan, He Wei

Affiliations

School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China.

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China.

Publication

Sensors (Basel). 2025 Aug 29;25(17):5356. doi: 10.3390/s25175356.

DOI: 10.3390/s25175356
PMID: 40942786
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12431471/
Abstract

Building extraction from high-resolution remote sensing imagery is critical for urban planning and disaster management, yet remains challenging due to significant intra-class variability in architectural styles and multi-scale distribution patterns of buildings. To address these limitations, we propose the Multi-Scale Guided Context-Aware Network (MSGCANet), a Transformer-based multi-scale guided context-aware network. Our framework integrates a Contextual Exploration Module (CEM) that synergizes asymmetric and progressive dilated convolutions to hierarchically expand receptive fields, enhancing discriminability for dense building features. We further design a Window-Guided Multi-Scale Attention Mechanism (WGMSAM) to dynamically establish cross-scale spatial dependencies through adaptive window partitioning, enabling precise fusion of local geometric details and global contextual semantics. Additionally, a cross-level Transformer decoder leverages deformable convolutions for spatially adaptive feature alignment and joint channel-spatial modeling. Experimental results show that MSGCANet achieves IoU values of 75.47%, 91.53%, and 83.10%, and F1-scores of 86.03%, 95.59%, and 90.78% on the Massachusetts, WHU, and Inria datasets, respectively, demonstrating robust performance across these datasets.
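The abstract notes that the Contextual Exploration Module (CEM) uses progressive dilated convolutions to hierarchically expand receptive fields. As an illustrative sketch only (the paper's exact kernel sizes and dilation rates are not given in the abstract, so the rates below are assumptions), the receptive field of a stride-1 stack of dilated convolutions grows linearly in the dilation rates:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stride-1 stack of dilated convolutions:
    rf = 1 + sum(d * (kernel_size - 1)) over the layers."""
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)
    return rf

# Hypothetical progressive rates (1, 2, 4) vs. three plain 3x3 convolutions
print(receptive_field(3, [1, 2, 4]))  # 15
print(receptive_field(3, [1, 1, 1]))  # 7
```

This shows why progressive dilation enlarges context cheaply: three 3x3 layers with rates 1, 2, 4 cover a 15-pixel extent at the same parameter cost as a plain stack covering 7.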

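The Window-Guided Multi-Scale Attention Mechanism (WGMSAM) builds cross-scale dependencies over adaptively partitioned windows. The adaptive partitioning itself is not specified in the abstract; as a minimal sketch under that caveat, the standard non-overlapping window partitioning used by Swin-style attention can be written as a reshape/transpose:

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping (ws, ws) windows.
    Returns an array of shape (num_windows, ws, ws, C), windows in row-major order."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)

# An 8x8 single-channel map split at window size 4 yields 4 windows
x = np.arange(64).reshape(8, 8, 1)
windows = window_partition(x, 4)
print(windows.shape)  # (4, 4, 4, 1)
```

Attention is then computed within each window, which keeps the cost linear in image size; varying `ws` across scales is one plausible reading of how multi-scale window guidance could be realized.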

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/0c3e02bb77dc/sensors-25-05356-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/d28fb9a89f62/sensors-25-05356-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/e7b01a47d087/sensors-25-05356-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/da45782db1e4/sensors-25-05356-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/8937d9b261c7/sensors-25-05356-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/4337d0f9d8ef/sensors-25-05356-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/199f42852adc/sensors-25-05356-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/03f5e27cbb0e/sensors-25-05356-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/66cdf48c1410/sensors-25-05356-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/7bdc7089b07d/sensors-25-05356-g010.jpg

Similar Articles

1. Multi-Scale Guided Context-Aware Transformer for Remote Sensing Building Extraction.
Sensors (Basel). 2025 Aug 29;25(17):5356. doi: 10.3390/s25175356.
2. Dynamic atrous attention and dual branch context fusion for cross scale Building segmentation in high resolution remote sensing imagery.
Sci Rep. 2025 Aug 21;15(1):30800. doi: 10.1038/s41598-025-14751-0.
3. Multi-level channel-spatial attention and light-weight scale-fusion network (MCSLF-Net): multi-level channel-spatial attention and light-weight scale-fusion transformer for 3D brain tumor segmentation.
Quant Imaging Med Surg. 2025 Jul 1;15(7):6301-6325. doi: 10.21037/qims-2025-354. Epub 2025 Jun 30.
4. Building extraction from remote sensing images based on multi-scale attention gate and enhanced positional information.
PeerJ Comput Sci. 2025 Apr 21;11:e2826. doi: 10.7717/peerj-cs.2826. eCollection 2025.
5. DGCFNet: Dual Global Context Fusion Network for remote sensing image semantic segmentation.
PeerJ Comput Sci. 2025 Mar 27;11:e2786. doi: 10.7717/peerj-cs.2786. eCollection 2025.
6. A novel image segmentation network with multi-scale and flow-guided attention for early screening of vaginal intraepithelial neoplasia (VAIN).
Med Phys. 2025 Aug;52(8):e18041. doi: 10.1002/mp.18041.
7. MFPI-Net: A Multi-Scale Feature Perception and Interaction Network for Semantic Segmentation of Urban Remote Sensing Images.
Sensors (Basel). 2025 Jul 27;25(15):4660. doi: 10.3390/s25154660.
8. TLTNet: A novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation.
Comput Biol Med. 2024 Aug;178:108773. doi: 10.1016/j.compbiomed.2024.108773. Epub 2024 Jun 25.
9. SCFMUNet: A fusion architecture based on multi-scale state space model and channel attention for medical image segmentation.
Neural Netw. 2025 Jul 29;192:107919. doi: 10.1016/j.neunet.2025.107919.
10. MCA-GAN: A lightweight Multi-scale Context-Aware Generative Adversarial Network for MRI reconstruction.
Magn Reson Imaging. 2025 Aug 6;124:110465. doi: 10.1016/j.mri.2025.110465.
