Zhao Fei, Zhang Ting, Song Yibing, Tang Ming, Wang Xiaobo, Wang Jinqiao
IEEE Trans Image Process. 2021;30:628-640. doi: 10.1109/TIP.2020.3036723. Epub 2020 Dec 4.
Siamese networks are prevalent in visual tracking because they localize targets efficiently. These networks take a search patch and a target template as inputs, where the target template usually comes from the initial frame. To preserve real-time efficiency, Siamese trackers do not update network parameters online. The fixed target template and CNN parameters make Siamese trackers ineffective at capturing target appearance variations. In this paper, we propose a template updating method via reinforcement learning for Siamese regression trackers. We collect a series of templates and learn to maintain them based on an actor-critic framework. Within this framework, the actor network, which is trained by deep reinforcement learning, effectively updates the templates based on the tracking result on each frame. In addition to the target template, we update the Siamese regression tracker online to adapt to target appearance variations. Experimental results on standard benchmarks show the effectiveness of both template and network updating. The proposed tracker, SiamRTU, performs favorably against state-of-the-art approaches.
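The template-maintenance idea in the abstract (keep a series of templates and let an actor decide, frame by frame, how to update them) can be illustrated with a small sketch. This is not the SiamRTU implementation: the names `TemplatePool`, `make_pool`, and `placeholder_actor`, the confidence threshold, the "replace the oldest slot" rule, and the choice to reserve slot 0 for the initial-frame template are all assumptions made for illustration. In the paper the actor is a network trained with deep reinforcement learning; here a hand-written heuristic merely stands in for it.

```python
from dataclasses import dataclass
from typing import List, Sequence

Template = Sequence[float]  # hypothetical stand-in for a template feature
KEEP = -1                   # action: leave the pool unchanged


@dataclass
class TemplatePool:
    templates: List[Template]  # assumption: slot 0 holds the initial-frame template
    ages: List[int]            # frames elapsed since each slot was last written

    def apply(self, action: int, candidate: Template) -> None:
        """Age every slot, then overwrite the slot chosen by the actor."""
        self.ages = [age + 1 for age in self.ages]
        if action == KEEP:
            return
        if not 1 <= action < len(self.templates):
            raise ValueError("slot 0 is reserved; action out of range")
        self.templates[action] = candidate
        self.ages[action] = 0


def make_pool(initial: Template, capacity: int) -> TemplatePool:
    """Fill every slot with the initial template (an illustrative choice)."""
    if capacity < 2:
        raise ValueError("capacity must be at least 2")
    return TemplatePool([initial] * capacity, [0] * capacity)


def placeholder_actor(pool: TemplatePool, confidence: float,
                      threshold: float = 0.8) -> int:
    """Heuristic stand-in for a learned actor network (assumption).

    Skips the update when the tracking result looks unreliable,
    otherwise picks the oldest non-initial slot for replacement.
    """
    if confidence < threshold:
        return KEEP
    return max(range(1, len(pool.templates)), key=lambda i: pool.ages[i])


# Toy run: (candidate template, tracking confidence) for three frames.
pool = make_pool(initial=[1.0, 0.0], capacity=3)
frames = [([0.9, 0.1], 0.90), ([0.2, 0.8], 0.50), ([0.8, 0.2], 0.95)]
actions = []
for candidate, confidence in frames:
    action = placeholder_actor(pool, confidence)
    pool.apply(action, candidate)
    actions.append(action)
```

In this toy run the low-confidence second frame is skipped, so its candidate never enters the pool, while the initial template in slot 0 is never overwritten. A learned actor would replace `placeholder_actor` with a policy whose update decisions are trained rather than hand-coded.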