NAN-DETR: noising multi-anchor makes DETR better for object detection.

Author information

Huang Zixin, Tao Xuesong, Liu Xinyuan

Affiliation

School of Computer Science, Beijing Institute of Technology, Beijing, China.

Publication information

Front Neurorobot. 2024 Oct 14;18:1484088. doi: 10.3389/fnbot.2024.1484088. eCollection 2024.

Abstract

Object detection plays a crucial role in robotic vision, focusing on accurately identifying and localizing objects within images. However, many existing methods encounter limitations, particularly when it comes to effectively implementing a one-to-many matching strategy. To address these challenges, we propose NAN-DETR (Noising Multi-Anchor Detection Transformer), an innovative framework based on DETR (Detection Transformer). NAN-DETR introduces three key improvements to transformer-based object detection: a decoder-based multi-anchor strategy, a centralization noising mechanism, and the integration of Complete Intersection over Union (CIoU) loss. The multi-anchor strategy leverages multiple anchors per object, significantly enhancing detection accuracy by improving the one-to-many matching process. The centralization noising mechanism mitigates conflicts among anchors by injecting controlled noise into the detection boxes, thereby increasing the robustness of the model. Additionally, CIoU loss, which incorporates both aspect ratio and spatial distance in its calculations, results in more precise bounding box predictions compared to the conventional IoU loss. Although NAN-DETR may not drastically improve real-time processing capabilities, its exceptional performance positions it as a highly reliable solution for diverse object detection scenarios.
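To make the CIoU term concrete, the following is a minimal sketch of the commonly used Complete IoU loss for a single pair of axis-aligned boxes. It follows the standard published CIoU formulation (IoU, normalized center distance, and an aspect-ratio consistency term) and is not taken from the NAN-DETR code; the function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Standard CIoU loss for two boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; not the authors' implementation.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Spatial distance: squared distance between box centers,
    # normalized by the squared diagonal of the smallest enclosing box.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, this quantity keeps growing as two non-overlapping boxes move further apart, and it penalizes a mismatch in aspect ratio even when the centers coincide, which is the behavior the abstract attributes to CIoU.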

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e79/11513373/0a548e879fda/fnbot-18-1484088-g0001.jpg
