A lightweight small object detection model for UAV images based on deep semantic integration.

Author information

Chao Manxin, Peng Can, Yun Lijun, Zhang Chunjie, Wang Huihua, Chen Zaiqing

Affiliations

The School of Information, Yunnan Normal University, Kunming, 650500, Yunnan, China.

Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, Kunming, 650500, Yunnan, China.

Publication information

Sci Rep. 2025 Aug 29;15(1):31888. doi: 10.1038/s41598-025-16878-6.

Abstract

Most existing small object detection methods rely on residual blocks to process deep feature maps. However, these residual blocks, composed of multiple large-kernel convolution layers, incur high computational costs and carry redundant information, which makes it difficult to improve detection performance for small objects. To address this, we design an improved feature pyramid network, the L Feature Pyramid Network (L-FPN), which reconstructs the original FPN structure to optimize the allocation of computational resources for small object detection. Based on L-FPN, we further propose a small object detector named BPD-YOLO. We introduce a Dual-phase Asymptotic Feature Fusion mechanism (DAFF), in which the shallow and deep semantic features extracted from the backbone network are first fused in parallel to mitigate the semantic gap. The intermediate semantic layers are then progressively integrated, enabling effective fusion of both shallow and deep feature representations. Additionally, we design the Deep Spatial Pyramid Fusion module (DSPF), which generates multi-scale feature representations as an alternative to conventional residual block stacking, thereby reducing computational overhead. In the shallow feature extraction stage, DSPF focuses on semantic integration and enhances the extraction of small object features. This strategy, which adaptively selects different modules based on the resolution of the feature maps, is referred to as the Decoupled feature Extraction-semantic Integration mechanism (DEI). Finally, we conduct extensive experiments and thorough evaluations on both the VisDrone and TinyPerson datasets. On the VisDrone dataset, compared to the baseline model YOLOv8n + p2, our BPD-YOLO model with L-FPN achieves a 2.8% improvement in mAP50 and a 1.4% increase in mAP50-95. On the TinyPerson dataset, BPD-YOLO further demonstrates its superiority in high-resolution feature extraction, effectively enhancing detection accuracy while significantly reducing computational costs.
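
For readers unfamiliar with the two metrics quoted above, the sketch below illustrates the matching rule behind them. It assumes the common COCO-style convention, in which mAP50 counts a detection as correct at an IoU threshold of 0.50 and mAP50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05; the abstract does not spell out the protocol, so treat that as an assumption. The function names and box coordinates are hypothetical, and this is not the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Return the IoU thresholds at which `pred` counts as a match for `truth`."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical 10 x 10 pixel object and a prediction shifted 2 pixels right.
truth = (10, 10, 20, 20)
pred = (12, 10, 22, 20)

print(f"IoU = {iou(pred, truth):.3f}")
print("matches at thresholds:", thresholds_passed(pred, truth))
```

In this toy case a two-pixel offset on a ten-pixel object already fails the stricter thresholds, which is one reason gains in mAP50-95 on small objects tend to be smaller than gains in mAP50.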

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e2b5/12397394/b9367220623c/41598_2025_16878_Figa_HTML.jpg
