Action-Driven Visual Object Tracking With Deep Reinforcement Learning

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2239-2252. doi: 10.1109/TNNLS.2018.2801826.

Abstract

In this paper, we propose an efficient visual tracker, which directly captures a bounding box containing the target object in a video by means of sequential actions learned using deep neural networks. The proposed deep neural network, which controls the tracking actions, is pretrained on various training video sequences and fine-tuned during actual tracking for online adaptation to changes in the target and background. The pretraining is done by utilizing deep reinforcement learning (RL) as well as supervised learning. The use of RL enables even partially labeled data to be successfully utilized for semisupervised learning. Through evaluation on the object tracking benchmark data set, the proposed tracker is validated to achieve a competitive performance at three times the speed of existing deep network-based trackers. The fast version of the proposed method, which operates in real time on a graphics processing unit, outperforms the state-of-the-art real-time trackers with an accuracy improvement of more than 8%.
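
The idea of capturing a bounding box through sequential actions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not enumerate the actions, so the action set, the step size, the function names, and the toy policy (which stands in for the learned deep network and simply steps toward a known target centre) are all assumptions made for illustration.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (centre x, centre y, width, height)

# Hypothetical discrete action set; the abstract does not list the actions.
ACTIONS = ("left", "right", "up", "down", "scale_up", "scale_down", "stop")


def apply_action(box: Box, action: str, step: float = 0.05) -> Box:
    """Move or rescale the box by a fraction `step` of its own size (assumed)."""
    cx, cy, w, h = box
    if action == "left":
        cx -= step * w
    elif action == "right":
        cx += step * w
    elif action == "up":
        cy -= step * h
    elif action == "down":
        cy += step * h
    elif action == "scale_up":
        w, h = w * (1 + step), h * (1 + step)
    elif action == "scale_down":
        w, h = w * (1 - step), h * (1 - step)
    elif action != "stop":
        raise ValueError(f"unknown action: {action}")
    return (cx, cy, w, h)


def track_frame(box: Box, policy: Callable[[Box], str], max_steps: int = 20) -> Box:
    """Apply actions chosen by `policy` until it returns 'stop' or the step budget runs out."""
    for _ in range(max_steps):
        action = policy(box)
        if action == "stop":
            break
        box = apply_action(box, action)
    return box


def make_toy_policy(target: Box, step: float = 0.05) -> Callable[[Box], str]:
    """Stand-in for the learned network: greedily steps toward a known target centre."""
    def policy(box: Box) -> str:
        cx, cy, w, h = box
        tx, ty, _, _ = target
        if abs(tx - cx) > step * w:  # more than one step away horizontally
            return "right" if tx > cx else "left"
        if abs(ty - cy) > step * h:  # more than one step away vertically
            return "down" if ty > cy else "up"
        return "stop"
    return policy
```

In a real tracker the policy would take the image patch inside the current box as input and the previous frame's final box would seed the next frame; here `track_frame((50.0, 50.0, 20.0, 20.0), make_toy_policy((60.0, 45.0, 20.0, 20.0)))` only shows the control flow of moving a box step by step until a stop action is chosen.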
