Gundogdu Erhan, Alatan A Aydin
IEEE Trans Image Process. 2018 Feb 14. doi: 10.1109/TIP.2018.2806280.
In recent years, correlation filters have shown remarkable performance in visual object tracking. The type of features employed in this family of trackers significantly affects tracking performance. The ultimate goal is to utilize robust features that are invariant to any kind of appearance change of the object, while predicting the object location as accurately as in the case of no appearance change. With the emergence of deep learning based methods, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to pre-trained networks trained for the object classification problem. To this end, this manuscript formulates the problem of learning deep fully convolutional features for CFB visual tracking. To learn the proposed model, a novel and efficient backpropagation algorithm based on the loss function of the network is presented. The proposed learning framework keeps the network model flexible for custom design and alleviates the dependency on networks trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker that was the top performer of VOT2016, an 18% increase in expected average overlap is achieved, tracking failures are reduced by 25%, and superiority over state-of-the-art methods is maintained on the OTB-2013 and OTB-2015 tracking datasets.
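The abstract does not detail the derivation, but the core idea of backpropagating a tracking loss through a correlation filter into the convolutional features can be sketched. The fragment below is a minimal illustration, not the manuscript's algorithm: it assumes PyTorch, solves the filter in closed form as Fourier-domain ridge regression, and relies on autograd to carry gradients from the response-map loss back into a toy feature network. The names cf_response and lam, the network, and all hyperparameters are hypothetical stand-ins.

```python
# Minimal sketch (not the authors' implementation) of a differentiable
# correlation-filter layer for feature learning.
import torch
import torch.nn as nn
import torch.nn.functional as F


def cf_response(x, z, y, lam=1e-2):
    """Differentiable correlation-filter response.

    x: training features (B, C, H, W); z: search features (B, C, H, W);
    y: desired response map, e.g. a centered Gaussian (B, 1, H, W).
    """
    X, Z, Y = torch.fft.fft2(x), torch.fft.fft2(z), torch.fft.fft2(y)
    # Closed-form multi-channel ridge-regression filter, per frequency bin:
    #   W_c = conj(X_c) * Y / (sum_k |X_k|^2 + lam)
    denom = (X * X.conj()).sum(dim=1, keepdim=True) + lam
    W = (X.conj() * Y) / denom
    # Apply the filter to the search features; the real part is the response.
    return torch.fft.ifft2((W * Z).sum(dim=1, keepdim=True)).real


# Toy fully convolutional feature extractor standing in for the fine-tuned
# convolutional part of a pre-trained network (purely illustrative).
feat = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                     nn.Conv2d(32, 32, 3, padding=1))

train_patch = torch.randn(1, 3, 64, 64)   # patch around the target
search_patch = torch.randn(1, 3, 64, 64)  # patch from the next frame

# Desired response: a Gaussian peak at the target location.
yy, xx = torch.meshgrid(torch.arange(64), torch.arange(64), indexing="ij")
y = torch.exp(-(((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 3.0 ** 2)))
y = y.view(1, 1, 64, 64)

r = cf_response(feat(train_patch), feat(search_patch), y)
loss = F.mse_loss(r, y)  # supervise the response map
loss.backward()          # gradients flow through the CF into the conv weights
```

Because the filter is a closed-form function of the features, no separate solver step blocks the gradient: the same forward pass that produces the response map also exposes the path along which the convolutional weights are updated.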