

Joint Correlation and Attention Based Feature Fusion Network for Accurate Visual Tracking.

Author information

Yang Yijin, Gu Xiaodong

Publication information

IEEE Trans Image Process. 2023;32:1705-1715. doi: 10.1109/TIP.2023.3251027. Epub 2023 Mar 9.

Abstract

Correlation operations and attention mechanisms are two popular feature fusion approaches that play an important role in visual object tracking. However, correlation-based tracking networks are sensitive to location information but lose some contextual semantics, while attention-based tracking networks can make full use of rich semantic information but ignore the position distribution of the tracked object. Therefore, in this paper, we propose a novel tracking framework based on joint correlation and attention networks, termed JCAT, which effectively combines the advantages of these two complementary feature fusion approaches. Concretely, the proposed JCAT approach adopts parallel correlation and attention branches to generate position and semantic features. The fused features are then obtained by directly adding the position features and the semantic features. Finally, the fused features are fed into a segmentation network to generate a pixel-wise state estimation of the object. Furthermore, we develop a segmentation memory bank and an online sample filtering mechanism for robust segmentation and tracking. Extensive experimental results on eight challenging visual tracking benchmarks show that the proposed JCAT tracker achieves very promising tracking performance and sets a new state of the art on the VOT2018 benchmark.
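
The fusion scheme described in the abstract can be sketched as a depth-wise cross-correlation branch (position-sensitive), a cross-attention branch (context semantics), and an element-wise addition of the two outputs. The following PyTorch snippet is a minimal, hypothetical illustration of that idea only; the module name JointFusion, the channel width, and the number of attention heads are assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointFusion(nn.Module):
    """Minimal sketch of parallel correlation + attention feature fusion.

    Hypothetical illustration only: a depth-wise cross-correlation branch
    provides position-sensitive features, a cross-attention branch provides
    context semantics, and the two are fused by element-wise addition.
    """

    def __init__(self, channels: int = 256, num_heads: int = 8):
        super().__init__()
        # Cross-attention: search-region tokens attend to template tokens.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # 1x1 conv to adjust the correlation response before fusion.
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, template: torch.Tensor, search: torch.Tensor) -> torch.Tensor:
        # template: (B, C, Ht, Wt) with odd Ht, Wt; search: (B, C, Hs, Ws)
        b, c, hs, ws = search.shape
        kh, kw = template.shape[-2:]

        # Correlation branch: depth-wise cross-correlation keeps position cues.
        kernel = template.reshape(b * c, 1, kh, kw)
        corr = F.conv2d(
            search.reshape(1, b * c, hs, ws),
            kernel,
            padding=(kh // 2, kw // 2),
            groups=b * c,
        ).reshape(b, c, hs, ws)
        pos_feat = self.proj(corr)

        # Attention branch: cross-attention supplies rich context semantics.
        q = search.flatten(2).transpose(1, 2)     # (B, Hs*Ws, C)
        kv = template.flatten(2).transpose(1, 2)  # (B, Ht*Wt, C)
        sem_feat, _ = self.attn(q, kv, kv)
        sem_feat = sem_feat.transpose(1, 2).reshape(b, c, hs, ws)

        # Fusion: direct element-wise addition of the two branch outputs.
        return pos_feat + sem_feat


if __name__ == "__main__":
    fusion = JointFusion(channels=256, num_heads=8)
    z = torch.randn(2, 256, 7, 7)      # template (exemplar) features
    x = torch.randn(2, 256, 16, 16)    # search-region features
    fused = fusion(z, x)               # would feed a segmentation head downstream
    print(fused.shape)                 # torch.Size([2, 256, 16, 16])
```

In this sketch the addition requires both branches to produce maps of the same shape, which is why the correlation uses "same"-style padding with an odd template size; the paper's segmentation memory bank and online sample filtering are not reflected here.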

