Liu Wenxin, Che Shengbing, Wang Wanqin, Du Yafei, Tuo Yangzhuo, Zhang Zixuan
College of Computer Science and Mathematics, Central South University of Forestry & Technology, Changsha, 410004, Hunan, China.
College of Electronic Information and Physics, Central South University of Forestry & Technology, Changsha, 410004, Hunan, China.
Sci Rep. 2025 Mar 19;15(1):9501. doi: 10.1038/s41598-025-93049-7.
Remote sensing images are essential in various fields, but their high-resolution (HR) acquisition is often limited by factors such as sensor resolution and high costs. To address this challenge, we propose the Multi-image Remote Sensing Super-Resolution with Enhanced Spatio-temporal Feature Interaction Fusion Network ([Formula: see text]N). The model is an end-to-end deep neural network, and its main innovations are as follows. First, an Attention-Based Feature Encoder (ABFE) module extracts spatial features from the low-resolution (LR) images; a Channel Attention Block (CAB) provides global guidance and per-channel weighting for the input features, strengthening the spatial feature extraction of the ABFE. Second, for temporal feature modeling, we design the Residual Temporal Attention Block (RTAB), which weights the k LR images of the same location captured at different times through a global residual temporal connection, exploiting their similarities and temporal dependencies while improving cross-layer information flow. The ConvGRU-RTAB Fusion Module (CRFM) applies the RTAB to the ABFE features to capture temporal information and fuses the spatial and temporal features. Finally, a Decoder module upscales the fused features to reconstruct a high-quality super-resolved image. Comparative experiments show that our model achieves notable improvements in cPSNR, reaching 49.69 dB and 51.57 dB on the NIR and RED bands of the PROBA-V dataset, respectively, and the visual quality of the reconstructed images surpasses that of state-of-the-art methods such as TR-MISR and MAST.
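The abstract does not detail the internals of the CAB, but channel attention that provides "global information guidance and weighting" is commonly realized as a squeeze-and-excitation block. Below is a minimal PyTorch sketch under that assumption; the class name, reduction ratio, and shapes are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class ChannelAttentionBlock(nn.Module):
    """Squeeze-and-excitation style channel attention: global average
    pooling summarizes each channel, a small bottleneck MLP produces
    per-channel weights, and the input features are rescaled by them.
    This is a generic sketch, not the paper's exact CAB."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # (B, C, H, W) -> (B, C, 1, 1)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                             # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.mlp(self.pool(x))                    # global guidance per channel
        return x * w                                  # reweight the spatial features

# Example: reweight encoder features of one LR view.
feats = torch.randn(4, 64, 32, 32)                   # (batch, channels, H, W)
cab = ChannelAttentionBlock(64)
print(cab(feats).shape)                               # torch.Size([4, 64, 32, 32])
```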
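The RTAB's exact layout is likewise not given in the abstract; a minimal sketch of what it describes, attention weighting across k co-registered LR views of the same scene plus a global residual connection, might look like the following (module name, scoring network, and shapes are assumptions):

```python
import torch
import torch.nn as nn

class ResidualTemporalAttention(nn.Module):
    """Weights k LR views of the same location by learned temporal
    attention, with a global residual connection so the original
    per-frame features still flow to later layers. A generic sketch,
    not the paper's exact RTAB."""
    def __init__(self, channels: int):
        super().__init__()
        # Scores each frame's feature map; weights are shared across frames.
        self.score = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, k, C, H, W) -- features of k acquisitions over time.
        b, k, c, h, w = x.shape
        logits = self.score(x.reshape(b * k, c, h, w)).view(b, k, 1, h, w)
        attn = torch.softmax(logits, dim=1)           # frames compete across time
        return x * attn + x                            # attention-weighted + global residual

frames = torch.randn(2, 9, 64, 32, 32)                # batch of 2, k = 9 views
rtab = ResidualTemporalAttention(64)
print(rtab(frames).shape)                              # torch.Size([2, 9, 64, 32, 32])
```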
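For the recurrent part of the CRFM, PyTorch has no built-in ConvGRU, so fusing the temporal stack into a single feature map can be sketched with a hand-rolled convolutional GRU cell. This is a standard ConvGRU formulation, not the paper's exact module:

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Convolutional GRU cell: the usual GRU gating with convolutions
    in place of dense layers, so the hidden state keeps its spatial layout."""
    def __init__(self, in_ch: int, hid_ch: int, ksize: int = 3):
        super().__init__()
        pad = ksize // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, ksize, padding=pad)
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, ksize, padding=pad)
        self.hid_ch = hid_ch

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde               # gated state update

def fuse(frames: torch.Tensor, cell: ConvGRUCell) -> torch.Tensor:
    """Run the cell over the k temporal views of (B, k, C, H, W) features,
    returning one fused spatio-temporal feature map per scene."""
    b, k, c, h, w = frames.shape
    hstate = frames.new_zeros(b, cell.hid_ch, h, w)
    for t in range(k):                                 # step over acquisitions
        hstate = cell(frames[:, t], hstate)
    return hstate

cell = ConvGRUCell(64, 64)
print(fuse(torch.randn(2, 9, 64, 32, 32), cell).shape)  # torch.Size([2, 64, 32, 32])
```

In this sketch the fused map would then go to a decoder that upscales it (PROBA-V uses a 3x factor) to produce the final super-resolved image.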