Zhang Guofeng, Peng Yanfei, Li Jincheng
School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China.
Sensors (Basel). 2025 Apr 17;25(8):2534. doi: 10.3390/s25082534.
In unmanned aerial vehicle (UAV) aerial imagery, small target size, dense distribution, and mutual occlusion often cause missed detections and false alarms. To address these challenges, this paper introduces YOLO-MARS, a small-target recognition model that incorporates a multi-level attention residual mechanism. First, an ERAC module is designed to improve the capture of small targets: it expands the feature perception range, applies a channel attention weight allocation strategy to strengthen small-target feature extraction, and adds a residual connection to stabilize gradient propagation. Second, a PD-ASPP structure is proposed that extracts differentiated features along parallel paths and uses depthwise separable convolutions to reduce computational redundancy, enabling targets at various scales to be identified against complex backgrounds. Third, a multi-scale SGCS-FPN fusion architecture is proposed that adds a shallow feature guidance branch to establish cross-level semantic associations, addressing the loss of small targets in deep networks. Finally, a dynamic WIoU evaluation function is adopted, which constructs adaptive penalty terms from the spatial distribution of the predicted and ground-truth bounding boxes to improve boundary localization accuracy for densely packed small targets seen from the UAV viewpoint. Experiments on the VisDrone2019 dataset show that YOLO-MARS achieves 40.9% mAP50 and 23.4% mAP50:95, improvements of 8.1% and 4.3%, respectively, over the baseline model YOLOv8n, demonstrating its advantages in UAV aerial target detection.
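The abstract credits depthwise separable convolutions with reducing computational redundancy in PD-ASPP but gives no figures. The sketch below is only a generic weight-count comparison between a standard convolution and its depthwise separable counterpart. The channel widths and kernel size are illustrative assumptions, not values taken from the paper, and the helper names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted).

    Dilation, as used in ASPP-style branches, changes the receptive
    field but not this count.
    """
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in: int, c_out: int, k: int) -> float:
    """Separable-to-standard weight ratio; equals 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Assumed sizes for illustration only: 256 -> 256 channels, 3 x 3 kernel.
    c_in, c_out, k = 256, 256, 3
    print("standard :", conv_params(c_in, c_out, k))
    print("separable:", depthwise_separable_params(c_in, c_out, k))
    print("ratio    :", round(reduction_ratio(c_in, c_out, k), 4))
```

Under these assumed sizes the separable layer needs roughly one ninth of the weights, and the ratio approaches 1/k² as the output width grows. How much of that saving carries over to YOLO-MARS depends on the actual PD-ASPP layer configuration, which the abstract does not specify.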