IEEE Trans Cybern. 2017 Oct;47(10):3172-3183. doi: 10.1109/TCYB.2017.2705345.
Motion models have proved to be a crucial part of the visual tracking process. In recent trackers, particle filter-based and sliding window-based motion models have been widely used. By treating the motion model as a sequence prediction problem, we can estimate the motion of objects from their trajectories. Moreover, it is possible to transfer the knowledge learned from annotated trajectories to new objects. Inspired by recent advances in deep learning for visual feature extraction and sequence prediction, we propose a trajectory predictor that learns prior knowledge from annotated trajectories and transfers it to predict the motion of target objects. In this predictor, convolutional neural networks extract the visual features of target objects, and a long short-term memory (LSTM) model leverages the annotated trajectory priors together with sequential visual information, including the tracked features and center locations of the target object, to predict the motion. Furthermore, to extend this method to videos in which annotated trajectories are difficult to obtain, we propose a dynamic weighted motion model that combines the trajectory predictor with a random sampler. To evaluate the transfer performance of the proposed trajectory predictor, we annotated a real-world vehicle dataset. Experimental results on both this real-world vehicle dataset and an online tracker benchmark dataset indicate that the proposed method outperforms several state-of-the-art trackers.
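The dynamic weighted motion model described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes candidate locations are drawn partly around the trajectory predictor's output and partly around the previous target center (the random-sampler branch), with a scalar weight that shifts toward whichever branch produced the best-scoring candidate in the last frame. All function and parameter names are hypothetical.

```python
import random

def propose_candidates(prev_center, predicted_center, w_pred,
                       n=10, radius=5.0, rng=None):
    """Blend predictor-based and random-sampler proposals.

    w_pred in [0, 1] is the dynamic weight on the trajectory-predictor
    branch; the remaining candidates come from Gaussian sampling around
    the previous center. Names and the blending scheme are illustrative.
    """
    rng = rng or random.Random(0)
    n_pred = round(w_pred * n)
    cands = []
    # Candidates near the trajectory predictor's output.
    for _ in range(n_pred):
        cands.append((predicted_center[0] + rng.gauss(0, radius),
                      predicted_center[1] + rng.gauss(0, radius)))
    # Random-sampler candidates near the previous target center.
    for _ in range(n - n_pred):
        cands.append((prev_center[0] + rng.gauss(0, radius),
                      prev_center[1] + rng.gauss(0, radius)))
    return cands

def update_weight(w_pred, predictor_won, lr=0.2):
    """Move the weight toward the branch whose candidate scored best."""
    target = 1.0 if predictor_won else 0.0
    return (1.0 - lr) * w_pred + lr * target
```

In an actual tracker, each candidate would be scored by the appearance model, and `predictor_won` would record whether the winning candidate came from the predictor branch, so the weight adapts as the predictor's reliability changes over the sequence.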