School of Information Science and Engineering, Xinjiang University, Urumqi, China.
PLoS One. 2024 Jun 3;19(6):e0298698. doi: 10.1371/journal.pone.0298698. eCollection 2024.
With the rapid development of society's technological capabilities, drone aerial images have gradually penetrated various industries. Because drones fly at variable speeds, the captured images suffer from shadows, blur, and occlusion. In addition, drones fly at varying altitudes, so target scales change constantly, making small targets difficult to detect and identify. To address these problems, this paper proposes an improved ASG-YOLOv5 model. First, this research proposes a dynamic contextual attention module that uses feature scores to dynamically assign feature weights and outputs feature information along the channel dimension, improving the model's attention to small-target feature information and strengthening the network's ability to extract contextual information. Second, this research designs a spatial gating filtering multi-directional weighted fusion module, which applies spatial filtering and weighted bidirectional fusion in the multi-scale fusion stage to improve the representation of weak targets, reduce interference from redundant information, and better adapt to detecting weak targets in images captured by unmanned aerial vehicle remote sensing. Meanwhile, the regression loss combines the Normalized Wasserstein Distance (NWD) with the CIoU loss: each regression box is modeled as a Gaussian distribution, and a similarity metric between boxes is computed from these distributions. This smooths the positional differences of small targets and addresses their high sensitivity to positional deviation, effectively improving the model's detection accuracy on small targets. The model is trained and tested on the VisDrone2021 and AI-TOD datasets, and the NWPU-RESISC dataset is used for visual detection validation.
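The NWD component described above has a closed form when each box is modeled as a 2D Gaussian. A minimal sketch follows, assuming the common formulation in which a box (cx, cy, w, h) maps to a Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); the constant `c` is a hypothetical dataset-dependent normalizer, not a value taken from this paper.

```python
import math

def nwd(box1, box2, c=12.8):
    """Normalized Wasserstein Distance between two boxes (cx, cy, w, h).

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and
    covariance diag(w^2/4, h^2/4). `c` is an assumed dataset-dependent
    constant, often related to the average target size.
    """
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    # Squared 2nd-order Wasserstein distance between the two Gaussians:
    # center offset plus half-size differences.
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    # Map the distance to a (0, 1] similarity score.
    return math.exp(-math.sqrt(w2_sq) / c)
```

Unlike IoU, this score degrades smoothly with center offset even when two small boxes no longer overlap, which is why it is less sensitive to the positional deviation of tiny targets.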
Experimental results show that ASG-YOLOv5 achieves better detection performance on unmanned aerial vehicle remote sensing aerial images. It reaches 86 frames per second (FPS), meeting the requirement for real-time small-target detection, and adapts well to weak and small targets in aerial image datasets. ASG-YOLOv5 outperforms many existing target detection methods, reaching a detection accuracy of 21.1% mAP; compared with the baseline YOLOv5 model, mAP improves by 2.9% and 1.4% on the two datasets, respectively. The project is available at https://github.com/woaini-shw/asg-yolov5.git.