Modeling of Multiple Spatial-Temporal Relations for Robust Visual Object Tracking.

Authors

Wang Shilei, Wang Zhenhua, Sun Qianqian, Cheng Gong, Ning Jifeng

Publication information

IEEE Trans Image Process. 2024;33:5073-5085. doi: 10.1109/TIP.2024.3453028. Epub 2024 Sep 17.

DOI: 10.1109/TIP.2024.3453028
PMID: 39250370
Abstract

Recently, one-stream trackers have achieved parallel feature extraction and relation modeling by exploiting Transformer-based architectures. This design greatly improves tracker performance. However, because one-stream trackers often overlook crucial tracking cues beyond the template, they are prone to unsatisfactory results in complex tracking scenarios. To tackle these challenges, we propose a multi-cue single-stream tracker, dubbed MCTrack, which seamlessly integrates template information, the historical trajectory, historical frames, and the search region for synchronized feature extraction and relation modeling. To achieve this, we employ two types of encoders to convert the template, historical frames, search region, and historical trajectory into tokens, which are then collectively fed into a Transformer architecture. To distill temporal and spatial cues, we introduce a novel adaptive update mechanism, which incorporates a thresholding component and a local multi-peak component to filter out less accurate and overly disturbed tracking cues. Empirically, MCTrack achieves leading performance on mainstream benchmark datasets, surpassing the most advanced SeqTrack by 2.0% in terms of the AO metric on GOT-10k. The code is available at https://github.com/wsumel/MCTrack.
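
The abstract names the two parts of the adaptive update mechanism (a thresholding component and a local multi-peak component) but does not give their details. The following is a minimal sketch of how such a gate could work, not the authors' implementation. It assumes the tracker produces a 2D response map, accepts a new cue only when the top score is confident enough, and rejects it when a second strong peak appears near the main one (a possible distractor). The function names and the parameters `conf_thresh`, `peak_ratio`, and `radius` are hypothetical.

```python
from typing import List, Tuple

ScoreMap = List[List[float]]


def find_peaks(score: ScoreMap, min_value: float) -> List[Tuple[int, int]]:
    """Return (row, col) of strict local maxima (8-neighbourhood) with value >= min_value."""
    h, w = len(score), len(score[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = score[r][c]
            if v < min_value:
                continue
            neighbours = [
                score[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            # Strict comparison: flat plateaus are not counted as peaks.
            if all(v > n for n in neighbours):
                peaks.append((r, c))
    return peaks


def should_update(
    score: ScoreMap,
    conf_thresh: float = 0.7,   # hypothetical confidence threshold
    peak_ratio: float = 0.5,    # hypothetical: a peak is "strong" if >= ratio * best
    radius: int = 3,            # hypothetical size of the local window (Chebyshev distance)
) -> bool:
    """Decide whether the current frame may be used to refresh the tracking cues."""
    best, br, bc = max(
        (v, r, c) for r, row in enumerate(score) for c, v in enumerate(row)
    )
    # Thresholding component: skip low-confidence predictions.
    if best < conf_thresh:
        return False
    # Local multi-peak component: skip frames with a competing peak nearby.
    strong = find_peaks(score, peak_ratio * best)
    local = [p for p in strong if max(abs(p[0] - br), abs(p[1] - bc)) <= radius]
    return len(local) <= 1


def make_map(cells: dict, size: int = 5) -> ScoreMap:
    """Build a size x size map of zeros with the given {(row, col): value} entries."""
    grid = [[0.0] * size for _ in range(size)]
    for (r, c), v in cells.items():
        grid[r][c] = v
    return grid


clean = make_map({(2, 2): 0.9})                  # one confident peak
uncertain = make_map({(2, 2): 0.4})              # peak below the threshold
distracted = make_map({(2, 2): 0.9, (2, 4): 0.8})  # second strong peak nearby
```

In this sketch only `clean` would pass the gate; `uncertain` fails the threshold test and `distracted` fails the multi-peak test. The thresholds here are illustrative values, not ones reported in the paper.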
