Tong Yang, Ye Hui, Yang Jishen, Yang Xiulong
School of Computer Science, Central China Normal University, Wuhan 430079, China.
Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA.
Sensors (Basel). 2025 Sep 5;25(17):5556. doi: 10.3390/s25175556.
Small object detection in UAV imagery remains challenging due to complex aerial perspectives and the presence of dense, small targets with blurred boundaries. To address these challenges, we propose ACD-DETR, an adaptive end-to-end Transformer detector tailored for UAV-based small object detection. The framework introduces three core modules: the Multi-Scale Edge-Enhanced Feature Fusion Module (MSEFM) to preserve fine-grained details; the Omni-Grained Boundary Calibrator (OG-BC) for boundary-aware semantic fusion; and the Dynamic Position Bias Attention-based Intra-scale Feature Interaction (DPB-AIFI) to enhance spatial reasoning. Furthermore, we introduce ACD-DETR-SBA+, a fusion-enhanced variant that removes OG-BC and DPB-AIFI while deploying densely connected Semantic-Boundary Aggregation (SBA) modules to intensify boundary-semantic fusion. This design sacrifices computational efficiency in exchange for higher detection precision, making it suitable for resource-rich deployment scenarios. On the VisDrone2019 dataset, ACD-DETR achieves 50.9% mAP@0.5, outperforming the RT-DETR-R18 baseline by 3.6 percentage points, while reducing parameters by 18.5%. ACD-DETR-SBA+ further improves accuracy to 52.0% mAP@0.5, demonstrating the benefit of SBA-based fusion. Extensive experiments on the VisDrone2019 and DOTA datasets demonstrate that ACD-DETR achieves a state-of-the-art trade-off between accuracy and efficiency, while ACD-DETR-SBA+ achieves further performance improvements at higher computational cost. Ablation studies and visual analyses validate the effectiveness of the proposed modules and design strategies.
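The abstract reports gains relative to the RT-DETR-R18 baseline without stating the baseline score. As a minimal sketch, the implied values can be recovered by arithmetic on the quoted figures alone. The variable and function names below are illustrative and do not come from the paper, and the derived numbers are inferences from the abstract, not reported results.

```python
# Figures quoted in the abstract (VisDrone2019, mAP@0.5, in percent).
ACD_DETR_MAP = 50.9            # ACD-DETR
ACD_DETR_SBA_PLUS_MAP = 52.0   # ACD-DETR-SBA+
GAIN_OVER_BASELINE_PP = 3.6    # percentage points over RT-DETR-R18
PARAM_REDUCTION = 0.185        # 18.5% fewer parameters than the baseline


def implied_baseline_map(model_map: float, gain_pp: float) -> float:
    """Baseline mAP implied by a model score and its stated gain (assumed helper)."""
    return round(model_map - gain_pp, 1)


# Baseline score implied by "outperforming ... by 3.6 percentage points".
baseline_map = implied_baseline_map(ACD_DETR_MAP, GAIN_OVER_BASELINE_PP)

# Gains of the SBA+ variant, in percentage points.
sba_plus_over_baseline = round(ACD_DETR_SBA_PLUS_MAP - baseline_map, 1)
sba_plus_over_acd_detr = round(ACD_DETR_SBA_PLUS_MAP - ACD_DETR_MAP, 1)

# Fraction of the baseline parameter count that ACD-DETR retains.
remaining_param_fraction = round(1.0 - PARAM_REDUCTION, 3)

print(f"Implied RT-DETR-R18 mAP@0.5: {baseline_map}%")
print(f"ACD-DETR-SBA+ vs. baseline: +{sba_plus_over_baseline} pp")
print(f"ACD-DETR-SBA+ vs. ACD-DETR: +{sba_plus_over_acd_detr} pp")
print(f"ACD-DETR parameter fraction of baseline: {remaining_param_fraction}")
```

Under these figures the implied baseline is 47.3% mAP@0.5, so ACD-DETR-SBA+ sits 4.7 percentage points above the baseline and 1.1 points above ACD-DETR. The abstract does not give absolute parameter counts or the computational cost of the SBA+ variant, so those are left out.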