Accurate UAV Small Object Detection Based on HRFPN and EfficentVMamba

Authors

Wu Shixiao, Lu Xingyuan, Guo Chengcheng, Guo Hong

Affiliations

School of Information Engineering, Wuhan Business University, Wuhan 430056, China.

Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin University of Technology, Tianjin 300384, China.

Publication information

Sensors (Basel). 2024 Jul 31;24(15):4966. doi: 10.3390/s24154966.

DOI:10.3390/s24154966

PMID:39124013

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11314822/

Abstract

(1) Background: Small objects in Unmanned Aerial Vehicle (UAV) images are often scattered across the image, including its corners, may be occluded by larger objects, and are susceptible to image noise. Because they occupy only a small area of the image, few effective features are available for detecting them. (2) Methods: To address small object detection in UAV imagery, we introduce a novel algorithm called High-Resolution Feature Pyramid Network Mamba-Based YOLO (HRMamba-YOLO). The algorithm combines the strengths of the High-Resolution Network (HRNet), EfficientVMamba, and YOLOv8, integrating a Double Spatial Pyramid Pooling (Double SPP) module, an Efficient Mamba Module (EMM), and a Fusion Mamba Module (FMM) to enhance feature extraction and capture contextual information. In addition, a new multi-scale feature fusion network, the High-Resolution Feature Pyramid Network (HRFPN), together with the FMM, improves feature interactions and small object detection performance. (3) Results: On the VisDroneDET dataset, the proposed algorithm achieved a Mean Average Precision (mAP) 4.4% higher than that of YOLOv8-m. On the DOTA1.5 dataset, HRMamba-YOLO achieved a mAP of 37.1%, surpassing YOLOv8-m by 3.8%. On the UCAS_AOD and DIOR datasets, our model's mAP was 1.5% and 0.3% higher than that of YOLOv8-m, respectively. For a fair comparison, all models were trained from scratch, without pre-trained weights. (4) Conclusions: This study highlights the strong performance and efficiency of HRMamba-YOLO in small object detection tasks and provides innovative solutions and valuable insights for future research.
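
The results above are reported as mAP. As a reminder of what that metric measures, the following is a minimal sketch of average precision for one class and its mean over classes. It assumes detections have already been matched to ground-truth boxes at a fixed IoU threshold and uses all-point interpolation. The abstract does not state which mAP variant or evaluation tool the authors used, so the function names and the interpolation choice here are illustrative assumptions, not the paper's evaluation code.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class (hypothetical helper).

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth box
        (matching at a fixed IoU threshold is assumed to be done already).
    num_ground_truth: number of ground-truth boxes of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

Under this definition, a gain such as the one reported over YOLOv8-m reflects a better precision-recall trade-off averaged over all object classes, not an improvement on any single class.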
