DV-DETR: Improved UAV Aerial Small Target Detection Algorithm Based on RT-DETR.

Authors

Wei Xiaolong, Yin Ling, Zhang Liangliang, Wu Fei

Affiliation

School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China.

Publication

Sensors (Basel). 2024 Nov 19;24(22):7376. doi: 10.3390/s24227376.

Abstract

For drone-based detection tasks, accurately identifying small-scale targets like people, bicycles, and pedestrians remains a key challenge. In this paper, we propose DV-DETR, an improved detection model based on the Real-Time Detection Transformer (RT-DETR), specifically optimized for small target detection in high-density scenes. To achieve this, we introduce three main enhancements: (1) ResNet18 as the backbone network to improve feature extraction and reduce model complexity; (2) the integration of recalibration attention units and deformable attention mechanisms in the neck network to enhance multi-scale feature fusion and improve localization accuracy; and (3) the use of the Focaler-IoU loss function to better handle the imbalanced distribution of target scales and focus on challenging samples. Experimental results on the VisDrone2019 dataset show that DV-DETR achieves an mAP@0.5 of 50.1%, a 1.7% improvement over the baseline model, while increasing detection speed from 75 FPS to 90 FPS, meeting real-time processing requirements. These improvements enhance the model's accuracy and efficiency and make it practical for complex, high-density urban environments, supporting real-world applications in UAV-based surveillance and monitoring tasks.
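
The abstract names the Focaler-IoU loss but does not give its formula. As a rough illustration only, the sketch below follows the piecewise-linear IoU remapping commonly associated with Focaler-IoU: IoU values below a lower threshold `d` map to 0, values above an upper threshold `u` map to 1, and values in between are rescaled linearly, with the loss taken as one minus the remapped value. The function names, the `(x1, y1, x2, y2)` box format, and the default thresholds `d=0.0` and `u=0.95` are assumptions made for this example, not settings reported for DV-DETR.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focaler_iou(iou, d=0.0, u=0.95):
    """Linearly remap IoU from [d, u] to [0, 1], clamping outside that range.

    d and u are hypothetical thresholds chosen for illustration.
    """
    if u <= d:
        raise ValueError("upper threshold u must be greater than d")
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_iou_loss(pred, target, d=0.0, u=0.95):
    """Loss is 1 minus the remapped IoU (assumed form, see lead-in)."""
    return 1.0 - focaler_iou(box_iou(pred, target), d, u)


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print("IoU:", box_iou(pred, target))
    print("Focaler-IoU loss:", focaler_iou_loss(pred, target, d=0.0, u=0.95))
```

Shifting the interval `[d, u]` changes which samples contribute gradient: a low interval emphasizes poorly localized (hard) boxes, while a high interval emphasizes boxes that are already well aligned. In a real detector this term would typically be combined with the model's other regression and classification losses.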

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9313/11598011/1b012223e98a/sensors-24-07376-g001.jpg
