
End-to-End Active Object Tracking and Its Real-World Deployment via Reinforcement Learning.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2020 Jun;42(6):1317-1332. doi: 10.1109/TPAMI.2019.2899570. Epub 2019 Feb 14.

DOI: 10.1109/TPAMI.2019.2899570
PMID: 30762532
Abstract

We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as input and produces the corresponding camera control signals as output (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control tasks separately, and the resulting system is difficult to tune jointly. These methods also require significant human effort for image labeling and expensive trial-and-error system tuning in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning. A ConvNet-LSTM function approximator is adopted for the direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function, which are crucial for successful training. The tracker trained in simulators (ViZDoom and Unreal Engine) demonstrates good generalization behaviors in the case of unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. The system is robust and can restore tracking after occasional loss of the target being tracked. We also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios. We demonstrate successful examples of such transfer, via experiments over the VOT dataset and the deployment of a real-world robot using the proposed active tracker trained in simulation.
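The abstract states that a customized reward function is crucial for training but does not reproduce it here. As a hedged illustration only (the function name, the camera-centric coordinates, and the normalization constants below are assumptions, not the paper's exact formula), a common shape for active-tracking rewards is to grant the maximum value when the target sits at a desired position and orientation relative to the camera, and to decay linearly with position and heading error:

```python
import math

def tracking_reward(x, y, yaw, d_max=2.0, yaw_max=math.pi, a_max=1.0):
    """Illustrative active-tracking reward (not the paper's formula).

    x, y : target position relative to the desired tracking point, in
           camera-centric coordinates (meters)
    yaw  : target heading relative to the camera axis (radians)

    Returns a_max for perfect tracking at (0, 0) with zero relative
    heading, decreasing as the target drifts away or rotates out of view.
    """
    dist_penalty = math.hypot(x, y) / d_max  # normalized position error
    yaw_penalty = abs(yaw) / yaw_max         # normalized heading error
    return a_max - (dist_penalty + yaw_penalty)

# Perfect tracking yields the maximum reward; a target d_max away with
# the camera pointed correctly yields zero.
print(tracking_reward(0.0, 0.0, 0.0))  # → 1.0
print(tracking_reward(2.0, 0.0, 0.0))  # → 0.0
```

A dense, shaped reward of this kind gives the policy a gradient toward the desired relative pose at every step, which is one plausible reason the abstract singles the reward design out as crucial for successful training.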

Similar Articles

1
End-to-End Active Object Tracking and Its Real-World Deployment via Reinforcement Learning.
IEEE Trans Pattern Anal Mach Intell. 2020 Jun;42(6):1317-1332. doi: 10.1109/TPAMI.2019.2899570. Epub 2019 Feb 14.
2
Trajectory Predictor by Using Recurrent Neural Networks in Visual Tracking.
IEEE Trans Cybern. 2017 Oct;47(10):3172-3183. doi: 10.1109/TCYB.2017.2705345.
3
Action-Driven Visual Object Tracking With Deep Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2239-2252. doi: 10.1109/TNNLS.2018.2801826.
4
AD-VAT+: An Asymmetric Dueling Mechanism for Learning and Understanding Visual Active Tracking.
IEEE Trans Pattern Anal Mach Intell. 2021 May;43(5):1467-1482. doi: 10.1109/TPAMI.2019.2952590. Epub 2021 Apr 1.
5
Jointly Feature Learning and Selection for Robust Tracking via a Gating Mechanism.
PLoS One. 2016 Aug 30;11(8):e0161808. doi: 10.1371/journal.pone.0161808. eCollection 2016.
6
Hedging Deep Features for Visual Tracking.
IEEE Trans Pattern Anal Mach Intell. 2019 May;41(5):1116-1130. doi: 10.1109/TPAMI.2018.2828817. Epub 2018 Apr 20.
7
Real-Time 3D Facial Tracking via Cascaded Compositional Learning.
IEEE Trans Image Process. 2021;30:3844-3857. doi: 10.1109/TIP.2021.3065819. Epub 2021 Mar 25.
8
Patch-based adaptive weighting with segmentation and scale (PAWSS) for visual tracking in surgical video.
Med Image Anal. 2019 Oct;57:120-135. doi: 10.1016/j.media.2019.07.002. Epub 2019 Jul 4.
9
Real-Time Human Detection for Aerial Captured Video Sequences via Deep Models.
Comput Intell Neurosci. 2018 Feb 12;2018:1639561. doi: 10.1155/2018/1639561. eCollection 2018.
10
MBT3D: Deep learning based multi-object tracker for bumblebee 3D flight path estimation.
PLoS One. 2023 Sep 22;18(9):e0291415. doi: 10.1371/journal.pone.0291415. eCollection 2023.

Cited By

1
Mosaic: in-memory computing and routing for small-world spike-based neuromorphic systems.
Nat Commun. 2024 Jan 2;15(1):142. doi: 10.1038/s41467-023-44365-x.
2
When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey.
Patterns (N Y). 2020 Jul 10;1(4):100050. doi: 10.1016/j.patter.2020.100050.
3
Perception-Action Coupling Target Tracking Control for a Snake Robot via Reinforcement Learning.
Front Neurorobot. 2020 Oct 20;14:591128. doi: 10.3389/fnbot.2020.591128. eCollection 2020.
4
Multiple-target tracking in human and machine vision.
PLoS Comput Biol. 2020 Apr 9;16(4):e1007698. doi: 10.1371/journal.pcbi.1007698. eCollection 2020 Apr.
5
Extreme Low-Light Image Enhancement for Surveillance Cameras Using Attention U-Net.
Sensors (Basel). 2020 Jan 15;20(2):495. doi: 10.3390/s20020495.
6
Object Tracking for a Smart City Using IoT and Edge Computing.
Sensors (Basel). 2019 Apr 28;19(9):1987. doi: 10.3390/s19091987.