Yang Na, Tian Chuanzhao, Gu Xingfa, Zhang Yanting, Li Xuewen, Zhang Feng
College of Remote Sensing and Information Engineering, North China Institute of Aerospace Engineering, Langfang 065000, China.
Collaborative Innovation Center of Aerospace Remote Sensing Information Processing and Application of Hebei Province, Langfang 065000, China.
Sensors (Basel). 2025 Sep 5;25(17):5531. doi: 10.3390/s25175531.
Semantic segmentation of high-resolution remote sensing images often suffers from inadequate fusion of global and local features, which leads to lost long-range dependencies and blurred spatial details, and from limited adaptability to objects at multiple scales. To overcome these limitations, this study proposes RST-Net, a semantic segmentation network with a dual-branch encoder. The encoder combines a ResNeXt-50-based CNN branch that extracts local spatial features with a Shunted Transformer (ST) branch that captures global contextual information. To further enhance multi-scale representation, a multi-scale feature enhancement module (MSFEM) is embedded in the CNN branch; it uses atrous and depthwise separable convolutions to dynamically aggregate features. In addition, a residual dynamic feature fusion (RDFF) module is incorporated into the skip connections to improve interaction between encoder and decoder features. Experiments on the Vaihingen and Potsdam datasets show that RST-Net achieves mean intersection over union (MIoU) scores of 77.04% and 79.56%, respectively, supporting its effectiveness for semantic segmentation.
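For readers unfamiliar with the reported metric, the following is a minimal sketch of how MIoU is conventionally computed from per-pixel labels: for each class, IoU = TP / (TP + FP + FN), and MIoU is the average over classes. The function name `mean_iou`, the flat-list input format, and the choice to skip classes that never occur are illustrative assumptions; the abstract does not state the authors' exact evaluation protocol (for example, which classes are included in the average).

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union for flat sequences of integer class labels.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # cm[t][p] counts pixels whose true class is t and predicted class is p.
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                   # true c, predicted other
        fp = sum(cm[r][c] for r in range(num_classes)) - tp    # predicted c, true other
        union = tp + fp + fn
        # Assumption: classes absent from both labels and predictions are skipped.
        if union > 0:
            ious.append(tp / union)

    if not ious:
        raise ValueError("no labeled pixels to evaluate")
    return sum(ious) / len(ious)


# Toy example: six pixels, three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(truth, pred, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5; a reported value such as 77.04% corresponds to the same quantity computed over all test pixels.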