Xu Zhijing, Wang Xin, Huang Kan, Chen Ren
College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
State Key Laboratory of Infrared Physics, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai, 200083, China.
Sci Rep. 2025 Jul 7;15(1):24183. doi: 10.1038/s41598-025-10286-6.
Object detection in remote sensing images is challenging because targets are typically small and densely distributed. Existing object detection algorithms often underperform in such scenes owing to their limited ability to handle fine-grained details and multi-scale objects. To address these challenges, this study introduces a detection framework with three key innovations. First, we propose the Fine-grained Enhanced Downsampling Network (FEDNet) as the feature extraction backbone, designed to preserve critical target information during downsampling through enhanced fine-grained feature representation. Second, we develop the Swin Transformer-based Progressive Aggregation Network (STPANet), which integrates Swin Transformer Blocks into the C3CST module to improve multi-scale feature fusion while capturing both global contextual information and local spatial details. Finally, we incorporate the Shape-IoU loss function to optimize bounding box regression, significantly improving small target detection accuracy while maintaining computational efficiency. Experimental results show that the proposed method achieves mean average precision (mAP@50) scores of 69.9% on the DOTA dataset and 85.5% on the DIOR dataset. These results highlight its superior detection performance under low-resolution conditions.
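To illustrate the kind of bounding box regression objective the abstract refers to, the following is a minimal sketch of a Shape-IoU-style loss for a single pair of axis-aligned boxes. It follows the general form of the published Shape-IoU formulation (an IoU term plus distance and shape penalties weighted by the ground-truth box's aspect), but the abstract does not state the exact variant or hyperparameters the authors use. The function name `shape_iou_loss`, the `(x1, y1, x2, y2)` box layout, and the defaults `scale=0.0`, `theta=4.0`, and the 0.5 shape weight are therefore assumptions made for illustration.

```python
import math


def shape_iou_loss(pred, gt, scale=0.0, theta=4.0, eps=1e-7):
    """Sketch of a Shape-IoU-style loss for two boxes given as (x1, y1, x2, y2).

    All hyperparameter defaults are illustrative assumptions, not values
    taken from the paper summarized above.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU of the two boxes.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Weights derived from the ground-truth box shape.
    ww = 2 * gw ** scale / (gw ** scale + gh ** scale)
    hh = 2 * gh ** scale / (gw ** scale + gh ** scale)

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Shape-weighted, normalized distance between box centers.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    distance = hh * dx ** 2 / c2 + ww * dy ** 2 / c2

    # Shape penalty on width and height mismatch.
    omega_w = hh * abs(pw - gw) / max(pw, gw)
    omega_h = ww * abs(ph - gh) / max(ph, gh)
    shape = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + distance + 0.5 * shape


if __name__ == "__main__":
    target = (0.0, 0.0, 10.0, 10.0)
    print(shape_iou_loss((0.0, 0.0, 10.0, 10.0), target))    # perfect match
    print(shape_iou_loss((1.0, 0.0, 11.0, 10.0), target))    # slight shift
    print(shape_iou_loss((20.0, 20.0, 30.0, 30.0), target))  # no overlap
```

In this sketch the loss is close to zero for a perfect match and grows as the predicted box drifts from or changes shape relative to the ground truth. Unlike plain IoU, the distance term still gives a useful signal for non-overlapping boxes, which matters for small targets where a shift of a few pixels can remove the overlap entirely.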