

Multi-Scale Guided Context-Aware Transformer for Remote Sensing Building Extraction.

Authors

Yu Mengxuan, Li Jiepan, He Wei

Affiliations

School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China.

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China.

Publication

Sensors (Basel). 2025 Aug 29;25(17):5356. doi: 10.3390/s25175356.

DOI: 10.3390/s25175356
PMID: 40942786
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12431471/
Abstract

Building extraction from high-resolution remote sensing imagery is critical for urban planning and disaster management, yet remains challenging due to significant intra-class variability in architectural styles and multi-scale distribution patterns of buildings. To address these limitations, we propose the Multi-Scale Guided Context-Aware Network (MSGCANet), a Transformer-based multi-scale guided context-aware network. Our framework integrates a Contextual Exploration Module (CEM) that synergizes asymmetric and progressive dilated convolutions to hierarchically expand receptive fields, enhancing discriminability for dense building features. We further design a Window-Guided Multi-Scale Attention Mechanism (WGMSAM) to dynamically establish cross-scale spatial dependencies through adaptive window partitioning, enabling precise fusion of local geometric details and global contextual semantics. Additionally, a cross-level Transformer decoder leverages deformable convolutions for spatially adaptive feature alignment and joint channel-spatial modeling. Experimental results show that MSGCANet achieves IoU values of 75.47%, 91.53%, and 83.10%, and F1-scores of 86.03%, 95.59%, and 90.78% on the Massachusetts, WHU, and Inria datasets, respectively, demonstrating robust performance across these datasets.
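The abstract notes that the Contextual Exploration Module (CEM) uses progressive dilated convolutions to hierarchically expand receptive fields. As an illustrative sketch only (the paper's exact kernel sizes and dilation rates are not given in the abstract, so the rates below are assumptions), the receptive field of a stride-1 stack of dilated convolutions grows linearly in the dilation rates:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stride-1 stack of dilated convolutions:
    rf = 1 + sum(d * (kernel_size - 1)) over the layers."""
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)
    return rf

# Hypothetical progressive rates (1, 2, 4) vs. three plain 3x3 convolutions
print(receptive_field(3, [1, 2, 4]))  # 15
print(receptive_field(3, [1, 1, 1]))  # 7
```

This shows why progressive dilation enlarges context cheaply: three 3x3 layers with rates 1, 2, 4 cover a 15-pixel extent at the same parameter cost as a plain stack covering 7.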

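The Window-Guided Multi-Scale Attention Mechanism (WGMSAM) builds cross-scale dependencies over adaptively partitioned windows. The adaptive partitioning itself is not specified in the abstract; as a minimal sketch under that caveat, the standard non-overlapping window partitioning used by Swin-style attention can be written as a reshape/transpose:

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping (ws, ws) windows.
    Returns an array of shape (num_windows, ws, ws, C), windows in row-major order."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)

# An 8x8 single-channel map split at window size 4 yields 4 windows
x = np.arange(64).reshape(8, 8, 1)
windows = window_partition(x, 4)
print(windows.shape)  # (4, 4, 4, 1)
```

Attention is then computed within each window, which keeps the cost linear in image size; varying `ws` across scales is one plausible reading of how multi-scale window guidance could be realized.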

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/0c3e02bb77dc/sensors-25-05356-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/d28fb9a89f62/sensors-25-05356-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/e7b01a47d087/sensors-25-05356-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/da45782db1e4/sensors-25-05356-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/8937d9b261c7/sensors-25-05356-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/4337d0f9d8ef/sensors-25-05356-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/199f42852adc/sensors-25-05356-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/03f5e27cbb0e/sensors-25-05356-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/66cdf48c1410/sensors-25-05356-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0b/12431471/7bdc7089b07d/sensors-25-05356-g010.jpg

Similar Articles

1. Multi-Scale Guided Context-Aware Transformer for Remote Sensing Building Extraction.
Sensors (Basel). 2025 Aug 29;25(17):5356. doi: 10.3390/s25175356.
2. Dynamic atrous attention and dual branch context fusion for cross scale Building segmentation in high resolution remote sensing imagery.
Sci Rep. 2025 Aug 21;15(1):30800. doi: 10.1038/s41598-025-14751-0.
3. Multi-level channel-spatial attention and light-weight scale-fusion network (MCSLF-Net): multi-level channel-spatial attention and light-weight scale-fusion transformer for 3D brain tumor segmentation.
Quant Imaging Med Surg. 2025 Jul 1;15(7):6301-6325. doi: 10.21037/qims-2025-354. Epub 2025 Jun 30.
4. Building extraction from remote sensing images based on multi-scale attention gate and enhanced positional information.
PeerJ Comput Sci. 2025 Apr 21;11:e2826. doi: 10.7717/peerj-cs.2826. eCollection 2025.
5. DGCFNet: Dual Global Context Fusion Network for remote sensing image semantic segmentation.
PeerJ Comput Sci. 2025 Mar 27;11:e2786. doi: 10.7717/peerj-cs.2786. eCollection 2025.
6. A novel image segmentation network with multi-scale and flow-guided attention for early screening of vaginal intraepithelial neoplasia (VAIN).
Med Phys. 2025 Aug;52(8):e18041. doi: 10.1002/mp.18041.
7. MFPI-Net: A Multi-Scale Feature Perception and Interaction Network for Semantic Segmentation of Urban Remote Sensing Images.
Sensors (Basel). 2025 Jul 27;25(15):4660. doi: 10.3390/s25154660.
8. TLTNet: A novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation.
Comput Biol Med. 2024 Aug;178:108773. doi: 10.1016/j.compbiomed.2024.108773. Epub 2024 Jun 25.
9. SCFMUNet: A fusion architecture based on multi-scale state space model and channel attention for medical image segmentation.
Neural Netw. 2025 Jul 29;192:107919. doi: 10.1016/j.neunet.2025.107919.
10. MCA-GAN: A lightweight Multi-scale Context-Aware Generative Adversarial Network for MRI reconstruction.
Magn Reson Imaging. 2025 Aug 6;124:110465. doi: 10.1016/j.mri.2025.110465.
