Multi-scale feature fusion and feature calibration with edge information enhancement for remote sensing object detection.

Authors

Yang Lihua, Gu Yi, Feng Hao

Affiliation

School of Mechanical and Electronic Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China.

Publication information

Sci Rep. 2025 May 2;15(1):15371. doi: 10.1038/s41598-025-99835-7.

DOI:10.1038/s41598-025-99835-7

PMID:40316719

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12048622/

Abstract

Vision Transformer-based detectors have achieved remarkable success in object detection, but applying them to high-resolution remote sensing imagery is difficult: processing such imagery raises computational complexity, especially when capturing fine-grained edge features, which leads to high computational cost and performance bottlenecks. There is therefore significant room for optimization. To address these challenges, we propose EMF-DETR, an improved detector based on RT-DETR-ResNet-18. EMF-DETR introduces a multi-scale edge-aware feature extraction network named MEFE-Net, which improves object recognition and localization by extracting multi-scale features and enhancing edge information for targets at each scale, and performs particularly well on small objects. To further strengthen feature representation, the model introduces the CSFCN method, which adaptively adjusts contextual information and calibrates spatial features so that features at different scales are accurately aligned. On the VisDrone2019 dataset, the proposed method improves mAP by 2.0% over the baseline model, with gains of 1.5% and 2.6% in small-object (AP_S) and medium-object (AP_M) detection, respectively. The number of parameters is also reduced by 20.22%. The method thus improves detection accuracy while lowering computational cost, which highlights its practical potential for remote sensing image analysis.
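The abstract does not describe the internals of MEFE-Net, so the following is only a rough illustration of the general idea of "enhancing edge information at each scale", not the authors' network. It builds a small image pyramid and adds a scaled edge map to every level. The Sobel operator, the 2x2 average pooling, the function names, and the `alpha` weight are all assumptions made for this sketch.

```python
import numpy as np

def avg_pool2x(x):
    """Downsample a 2-D map by averaging non-overlapping 2x2 blocks."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]  # drop a trailing odd row/column, if any
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def sobel_magnitude(x):
    """Gradient magnitude from 3x3 Sobel kernels (border values replicated)."""
    p = np.pad(x, 1, mode="edge")
    # Horizontal gradient: right neighbours minus left neighbours.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: lower neighbours minus upper neighbours.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def edge_enhanced_pyramid(image, levels=3, alpha=0.5):
    """Return `levels` maps, each equal to the map plus alpha * its edge map.

    Hypothetical helper: `alpha` is an illustrative weight, not a value
    taken from the paper.
    """
    out = []
    x = np.asarray(image, dtype=np.float64)
    for _ in range(levels):
        out.append(x + alpha * sobel_magnitude(x))
        x = avg_pool2x(x)  # move to the next, coarser scale
    return out
```

In a learned detector the fixed Sobel kernels would typically be replaced by trainable layers operating on feature maps rather than raw pixels; the sketch only shows where an edge term enters at each scale.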
