Revisiting Siamese-Based 3D Single Object Tracking With a Versatile Transformer.

Authors

Liu Jiaming, Wu Yue, Miao Qiguang, Gong Maoguo, Kong Linghe

Publication

IEEE Trans Pattern Anal Mach Intell. 2025 Sep;47(9):8148-8164. doi: 10.1109/TPAMI.2025.3581381.

Abstract

3D Single Object Tracking (SOT) plays an important role in real-world visual applications such as autonomous driving and planning. Realizing effective 3D SOT remains challenging because the underlying point clouds are sparse and the factors that influence tracking are complex. Inspired by the long-range modeling ability of transformers, we propose a Versatile Point Tracking Transformer (VPTT) for 3D SOT, which passes object guidance from the template point cloud to the search-area point cloud under the siamese-based tracking paradigm. Specifically, VPTT employs self- and cross-attention mechanisms and extends them into four matching operations, leveraging the contextual information of consecutive frames to improve tracking results. We construct a deep network, VerFormer, consisting of four successive transformer layers that perform matching operations involving fusional transformation, separative discrimination, intersectional interaction, and unidirectional propagation from shallow to deep. Considering that the tracking task involves multiple processes, VPTT further learns to forecast intermediate outputs, including mask probability, trailing distance, and heading angle, at each stage. This specialized design allows VPTT to revisit the end-to-end training paradigm used for 3D tracking while developing a versatile transformer that is well suited to the 3D SOT task. Experiments on three benchmarks, KITTI, nuScenes, and Waymo, show that VPTT achieves state-of-the-art performance among siamese-based trackers while running at $\sim$62 FPS.
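To make the idea of template-to-search guidance more concrete, here is a minimal sketch of plain scaled dot-product self- and cross-attention between two sets of point features. It is not the authors' VerFormer: the abstract does not specify layer details, so the feature sizes, the random inputs, the helper names `softmax` and `attention`, and the absence of learned projections are all assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, key, value):
    """Scaled dot-product attention.

    query: (N, d); key, value: (M, d). Returns the (N, d) output and
    the (N, M) attention weights. No learned projections are applied
    (an illustrative simplification).
    """
    d = query.shape[-1]
    weights = softmax(query @ key.T / np.sqrt(d), axis=-1)
    return weights @ value, weights

rng = np.random.default_rng(0)
# Hypothetical per-point features: 64 template points and 128
# search-area points, each with a 32-dimensional descriptor.
template_feats = rng.normal(size=(64, 32))
search_feats = rng.normal(size=(128, 32))

# Self-attention: each search point gathers context from the search area.
search_ctx, _ = attention(search_feats, search_feats, search_feats)

# Cross-attention: search points query the template, so template
# information guides the search-area features.
guided, weights = attention(search_ctx, template_feats, template_feats)
```

In this sketch each row of `weights` is a distribution over template points, so every search point receives a convex combination of template features. A full tracker would add learned query/key/value projections, multiple heads, residual connections, and prediction heads for the intermediate outputs the abstract mentions.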
