FDA-DETR: A frequency-aware DETR with dynamic query and adaptive multi-task optimization for oriented small object detection.

Authors

Ju Cheng, Zhao Yu, Miao Shuiqing, Li Dina, Chai Rongjun, Xie Yuansha, Yan Wenyao

Affiliations

School of Data Science and Engineering, Xi'an Innovation College of Yan'an University, Xi'an, China.

Institute of Artificial Intelligence, Xi'an Innovation College of Yan'an University, Xi'an, China.

Publication information

PLoS One. 2025 Aug 29;20(8):e0330929. doi: 10.1371/journal.pone.0330929. eCollection 2025.

DOI:10.1371/journal.pone.0330929

PMID:40880405

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12396711/

Abstract

Oriented small object detection remains a challenging problem in computer vision, largely because existing detection Transformer (DETR)-based detectors suffer from weak feature representation and high computational cost. To address these issues, this work presents the Frequency Domain Awareness Detection Transformer (FDA-DETR), an end-to-end framework that improves both accuracy and efficiency for oriented small object detection. The core of FDA-DETR is a multi-scale frequency domain enhancement that amplifies the high-frequency details crucial for discriminating small objects. A density-aware dynamic query mechanism further adapts the allocation of computational resources to object density and orientation, improving detection in complex scenes. To balance global context and local detail, a multi-granularity attention fusion module is incorporated, while an adaptive multi-task loss based on Bayesian uncertainty enables dynamic optimization across multiple objectives. Experiments on public datasets show that FDA-DETR achieves higher detection accuracy and faster inference than existing DETR-based methods, particularly for small and densely distributed objects. These results, supported by theoretical analysis and ablation studies, highlight the effectiveness and synergy of the proposed modules. FDA-DETR thus provides a robust solution for oriented small object detection and offers new perspectives for future research on feature learning and attention mechanisms.
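
The abstract does not give the form of the adaptive multi-task loss. As a rough illustration of the general idea only, the sketch below uses a widely used uncertainty-weighting scheme in which each task loss L_i is scaled by a learned log-variance s_i = log(sigma_i^2). This is an assumption and not necessarily the formulation used in FDA-DETR. The function names and the three example loss values (classification, box regression, angle regression) are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learned log-variances s_i = log(sigma_i^2).

    total = sum_i ( 0.5 * exp(-s_i) * L_i + 0.5 * s_i )

    A large s_i down-weights task i, while the 0.5 * s_i term penalizes
    inflating the variance without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need one log-variance per task loss")
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


def log_var_gradients(task_losses, log_vars):
    """Gradient of the total with respect to each s_i:
    d(total)/d(s_i) = 0.5 - 0.5 * exp(-s_i) * L_i
    """
    return [0.5 - 0.5 * math.exp(-s) * loss
            for loss, s in zip(task_losses, log_vars)]


# Hypothetical loss values: classification, box regression, angle regression.
losses = [2.0, 0.5, 1.0]
log_vars = [0.0, 0.0, 0.0]  # all tasks start with sigma^2 = 1

print(uncertainty_weighted_loss(losses, log_vars))  # 1.75
print(log_var_gradients(losses, log_vars))          # [-0.5, 0.25, 0.0]
```

With the task losses held fixed, the gradient vanishes at s_i = log(L_i), so under this scheme tasks with larger losses settle at larger variances and smaller weights. In a real training loop the s_i would be trainable parameters updated by the optimizer together with the network weights.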
