An object detection model AAPW-YOLO for UAV remote sensing images based on adaptive convolution and reconstructed feature fusion.

Authors

Wu Yiming, Mu Xiaofang, Shi Hong, Hou Mingxing

Affiliations

School of Computer Science and Technology, Taiyuan Normal University, Taiyuan, 030000, China.

Shanxi Institute of Energy, Taiyuan, 030000, China.

Publication information

Sci Rep. 2025 May 9;15(1):16214. doi: 10.1038/s41598-025-00239-4.

Abstract

In small object detection scenarios such as UAV aerial imagery and remote sensing, feature extraction is difficult mainly because objects are small, vary across multiple scales, and are subject to background interference. To address these challenges, this paper presents AAPW-YOLO, a small object detection model based on adaptive convolution and reconstructed feature fusion. In AAPW-YOLO, we improve the standard convolution and the CSP Bottleneck with 2 Convolutions (C2f) structure in the You Only Look Once v8 (YOLOv8) backbone network by using Alterable Kernel Convolution (AKConv), which improves the network's ability to capture features across various scales while considerably reducing the model's parameter count. Additionally, we introduce the Attentional Scale Sequence Fusion P2 (ASFP2) structure, which enhances the feature fusion mechanism of Attentional Scale Sequence Fusion You Only Look Once (ASF-YOLO) and incorporates a P2 detection layer. This optimizes feature fusion in the YOLOv8 neck, enhancing the network's ability to capture both fine details and global contextual information while further reducing the number of parameters. Finally, we adopt a gradient-enhancing strategy with the Wise Intersection over Union (Wise-IoU) loss function to balance the gradient contributions from anchor boxes of different qualities during training, thereby improving regression accuracy. Experimental results show that, on the VisDrone2019 dataset, the proposed model reduces the parameter count by 30% and improves mAP@0.5 by 3.6%; on the DOTA v1.0 dataset, it reduces the parameter count by 30% and improves mAP@0.5 by 2.5%. The proposed model achieves high recognition accuracy with fewer parameters, enhancing the robustness and generalization ability of the network.
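Both the Wise-IoU loss and the mAP@0.5 metric mentioned in the abstract build on the plain Intersection over Union (IoU) between a predicted box and a ground-truth box. The following is a minimal sketch of that underlying quantity only. It does not implement the Wise-IoU weighting or the authors' training code, and the function names `box_iou` and `matches_at_threshold`, as well as the `(x1, y1, x2, y2)` corner format, are assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes, each given as (x1, y1, x2, y2).

    Assumed (hypothetical) box format: top-left and bottom-right corners.
    """
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred, gt, threshold=0.5):
    """True if a prediction counts as a hit at the given IoU threshold.

    With threshold=0.5 this is the matching rule behind mAP@0.5.
    """
    return box_iou(pred, gt) >= threshold


# The same 4-pixel horizontal offset on a small and a large object.
small_gt, small_pred = (0, 0, 10, 10), (4, 0, 14, 10)
large_gt, large_pred = (0, 0, 100, 100), (4, 0, 104, 100)

print(matches_at_threshold(small_pred, small_gt))  # small object: miss
print(matches_at_threshold(large_pred, large_gt))  # large object: hit
```

As a general observation rather than a result from the paper, the example shows how a fixed pixel offset lowers IoU far more for a small box than for a large one, which is one reason localization of small objects is sensitive to regression error.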
