Chang Jing, He Xiaohui, Song Dingjun, Li Panle, Qiao Mengjia, Cheng Xijie
School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, China.
School of Geoscience and Technology, Zhengzhou University, Zhengzhou, 450001, China.
Sci Rep. 2025 Jul 10;15(1):24938. doi: 10.1038/s41598-025-09086-9.
The complexity of the information in remote sensing images often causes building extraction methods to produce incomplete building contours and to adapt poorly to complex building scenes. To address these issues, we propose a novel multi-scale network with dual attention mechanisms to extract clear building boundaries. In the encoder, a Squeeze-and-Excitation (SE) module is employed to strengthen feature extraction, and an Atrous Spatial Pyramid Pooling (ASPP) module is integrated to capture multi-scale feature information. In the decoding phase, channel grouping shuffle and the dual attention mechanisms are combined to exploit the interrelations and global dependencies of building features. Finally, a hybrid loss function is devised to address class imbalance and thereby ensure more stable network training. Experimental evaluations on two high-resolution remote sensing datasets, Zimbabwe and Massachusetts, demonstrate that the proposed method markedly outperforms semantic segmentation networks such as PSPNet, U-Net, and DANet in terms of accuracy, recall, F1 score, and Mean Intersection over Union (MIoU), achieving an F1 score of up to 83.23% and an MIoU of up to 73.56%. This multi-scale attention network holds substantial promise for practical applications in building extraction.
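The evaluation metrics named above can be illustrated with a minimal sketch. It assumes binary masks (1 = building, 0 = background) flattened to lists, and it assumes the standard confusion-matrix definitions; the abstract does not state the exact formulas used in the paper, and the function names `confusion_counts` and `segmentation_metrics` are hypothetical, chosen only for this illustration.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat binary masks (1 = building, 0 = background)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred, truth):
    """Standard pixel-level metrics (assumed definitions, not taken from the paper)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    # IoU per class, then averaged over the two classes to obtain MIoU.
    iou_building = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miou": (iou_building + iou_background) / 2,
    }


# Toy example with eight pixels: 2 TP, 1 FP, 1 FN, 4 TN.
example = segmentation_metrics(
    pred=[1, 1, 0, 0, 1, 0, 0, 0],
    truth=[1, 0, 0, 0, 1, 1, 0, 0],
)
print(example)
```

On a building dataset, background pixels usually dominate, so pixel accuracy tends to look high even when contours are poor; F1 and MIoU are less forgiving, which is presumably why they are the headline figures here.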