Zhu Jincan, Rong Jian, Kou Weili, Zhou Qingyang, Suo Peichun
College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, 650224, China.
The Key Laboratory of State Forestry and Grassland Administration on Forestry Ecological Big Data, Kunming, 650224, China.
Sci Rep. 2025 Jul 30;15(1):27755. doi: 10.1038/s41598-025-12670-8.
As Unmanned Aerial Vehicles (UAVs) become more widely used, identifying them is increasingly important for security. Advanced identification technology can help counter illegal UAV incursions and reduce threats to aviation safety. However, identification performance often degrades at long range and in complex environments, and accurate, reliable UAV identification at night remains a significant challenge. To address this, this paper proposes an improved algorithm, YOLOv9-CAG, which draws on data from multiple sensors and integrates detection on visible-light, infrared, and audio signals. The improvements cover three aspects: (1) the RepNSCPELAN4 module at the end of the backbone is replaced with a CAM context feature enhancement module to strengthen feature extraction for small UAV targets; (2) a GAM attention mechanism is integrated into the head network to sharpen the model's focus on UAV-specific regions and features; and (3) an enhanced AKConv dynamic convolution, built on the original RepNSCPELAN4 module, is applied at the end of the head to capture contour details more effectively.
On the Bird-UAV visible-light dataset, the improved YOLOv9-CAG model achieves a UAV mAP0.50 of 92.0%, which is 10.8% higher than the original YOLOv9 model. On the infrared dataset, its UAV mAP0.50 and recall reach 86.5% and 89.2%, increases of 12.4% and 11.4% respectively over the original YOLOv9 model, which extends the model's effectiveness to infrared scenes. On the audio spectrum dataset, the improved model raises UAV mAP0.50 by 8.4% and recall by 14.3% compared with the original YOLOv9 model. The improved model also recognizes birds well, achieving mAP0.50 of 85% and 94.8% under visible-light and infrared conditions respectively, which is 19.8% and 1.1% higher than the original YOLOv9 model. In validation on real-world visible-light and infrared videos, YOLOv9-CAG improves overall average accuracy by 6.8% and 3.8%, respectively, over the original YOLOv9 model. These results show that the improved YOLOv9-CAG model performs well at UAV recognition across multiple scenarios. This work pioneers a multimodal UAV detection framework that significantly improves identification accuracy in challenging conditions, advancing UAV identification technology.
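The abstract reports results as mAP0.50 and recall. For readers unfamiliar with these metrics, the following is a minimal sketch of how average precision at an IoU threshold of 0.50 is commonly computed for one class; mAP0.50 is then the mean of this value over the classes (here, UAV and bird). The corner box format, the greedy matching rule, the all-point interpolation, and every function name below are assumptions made for illustration, not the authors' evaluation code. Detection frameworks also usually report recall at a chosen confidence threshold, whereas this sketch returns recall over all detections.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: Sequence[Tuple[float, Box]],  # (confidence, box) pairs
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.50,
) -> Tuple[float, float]:
    """Return (AP, recall) for one class, using greedy matching."""
    n_gt = len(ground_truth)
    if n_gt == 0:
        return 0.0, 0.0

    matched = [False] * n_gt
    tp_flags: List[bool] = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(ground_truth):
            if matched[i]:
                continue  # each ground-truth box can be claimed only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        is_tp = best_idx >= 0 and best_iou >= iou_threshold
        if is_tp:
            matched[best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection in ranked order.
    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / n_gt)

    # All-point interpolation: make precision non-increasing in recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r

    return ap, (recalls[-1] if recalls else 0.0)


# Toy data, invented for illustration only: two ground-truth boxes and
# three detections, of which the second overlaps nothing.
toy_gt = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
toy_dets = [
    (0.9, (0.0, 0.0, 10.0, 10.0)),
    (0.8, (50.0, 50.0, 60.0, 60.0)),
    (0.7, (20.0, 20.0, 30.0, 31.0)),
]
toy_ap, toy_recall = average_precision(toy_dets, toy_gt)
print(f"AP@0.50 = {toy_ap:.4f}, recall = {toy_recall:.2f}")
```

In this toy case both ground-truth boxes are eventually found, so recall is 1.0, while the false positive ranked second pulls the average precision below 1.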