Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection.

Authors

Zhang Yifan, Zhu Zhiyu, Hou Junhui, Wu Dapeng

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):10614-10628. doi: 10.1109/TPAMI.2024.3443335. Epub 2024 Nov 6.

DOI:10.1109/TPAMI.2024.3443335
PMID:39141469
Abstract

The Detection Transformer (DETR) has revolutionized the design of CNN-based object detection systems, showcasing impressive performance. However, its potential in multi-frame 3D object detection remains largely unexplored. In this paper, we present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection by addressing three key aspects specifically tailored to this task. First, to model inter-object spatial interactions and complex temporal dependencies, we introduce a spatial-temporal graph attention network, which represents queries as nodes in a graph and enables effective modeling of object interactions within a social context. Second, to address hard cases being missed in the proposed output of the encoder in the current frame, we incorporate the output of the previous frame to initialize the query input of the decoder. Finally, the network struggles to distinguish the positive query from other highly similar queries that are not the best match; such similar queries are insufficiently suppressed and turn into redundant prediction boxes. To address this issue, our proposed IoU regularization term encourages similar queries to become distinct during refinement. Through extensive experiments, we demonstrate the effectiveness of our approach in handling challenging scenarios, while incurring only minor additional computational overhead.
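
The abstract does not give the form of the IoU regularization term, so the following is only an illustrative sketch of the general idea: penalizing overlap between boxes predicted by different queries so that near-duplicate queries are pushed apart. It assumes axis-aligned bird's-eye-view boxes in `(x_min, y_min, x_max, y_max)` form and uses plain Python rather than a differentiable tensor library. The function names `bev_iou` and `pairwise_iou_penalty` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def bev_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    overlap_x = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_y = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_x * overlap_y
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pairwise_iou_penalty(boxes):
    """Mean IoU over all unordered pairs of predicted boxes.

    Assumption: a simple stand-in for an IoU-based regularizer. The value is
    0 when no two boxes overlap and grows as predictions become redundant.
    """
    pairs = list(combinations(boxes, 2))
    if not pairs:
        return 0.0
    return sum(bev_iou(a, b) for a, b in pairs) / len(pairs)


# Two overlapping predictions plus one isolated prediction.
boxes = [(0, 0, 2, 2), (1, 0, 3, 2), (10, 10, 12, 12)]
penalty = pairwise_iou_penalty(boxes)
```

In a training setup, a term of this kind would typically be computed on differentiable box parameters and added to the detection loss with a small weight; the weight and the exact pairing rule are design choices that the abstract does not specify.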


Similar articles

1. Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection.
   IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):10614-10628. doi: 10.1109/TPAMI.2024.3443335. Epub 2024 Nov 6.
2. TransVOD: End-to-End Video Object Detection With Spatial-Temporal Transformers.
   IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7853-7869. doi: 10.1109/TPAMI.2022.3223955. Epub 2023 May 5.
3. Unsupervised Pre-Training for Detection Transformers.
   IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):12772-12782. doi: 10.1109/TPAMI.2022.3216514. Epub 2023 Oct 3.
4. Graph-DETR4D: Spatio-Temporal Graph Modeling for Multi-View 3D Object Detection.
   IEEE Trans Image Process. 2024;33:4488-4500. doi: 10.1109/TIP.2024.3430473. Epub 2024 Aug 21.
5. NAN-DETR: noising multi-anchor makes DETR better for object detection.
   Front Neurorobot. 2024 Oct 14;18:1484088. doi: 10.3389/fnbot.2024.1484088. eCollection 2024.
6. CCDN-DETR: A Detection Transformer Based on Constrained Contrast Denoising for Multi-Class Synthetic Aperture Radar Object Detection.
   Sensors (Basel). 2024 Mar 11;24(6):1793. doi: 10.3390/s24061793.
7. Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning.
   IEEE Trans Pattern Anal Mach Intell. 2024 Nov;46(11):7331-7347. doi: 10.1109/TPAMI.2024.3387838. Epub 2024 Oct 3.
8. AO-DETR: Anti-Overlapping DETR for X-Ray Prohibited Items Detection.
   IEEE Trans Neural Netw Learn Syst. 2025 Jul;36(7):12076-12090. doi: 10.1109/TNNLS.2024.3487833.
9. Simple Conditional Spatial Query Mask Deformable Detection Transformer: A Detection Approach for Multi-Style Strokes of Chinese Characters.
   Sensors (Basel). 2024 Jan 31;24(3):931. doi: 10.3390/s24030931.
10. DN-DETR: Accelerate DETR Training by Introducing Query DeNoising.
    IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2239-2251. doi: 10.1109/TPAMI.2023.3335410. Epub 2024 Mar 6.