Modeling Task Uncertainty for Safe Meta-Imitation Learning

Author information

Matsushima Tatsuya, Kondo Naruya, Iwasawa Yusuke, Nasuno Kaoru, Matsuo Yutaka

Affiliations

School of Engineering, The University of Tokyo, Tokyo, Japan.

DeepX Inc., Tokyo, Japan.

Publication information

Front Robot AI. 2020 Nov 27;7:606361. doi: 10.3389/frobt.2020.606361. eCollection 2020.

DOI:10.3389/frobt.2020.606361

PMID:33501364

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7805769/

Abstract

Learning controllers from experience data is a promising way to endow robots with the flexibility to perform a wide range of tasks in diverse and complex environments. In particular, some recent meta-learning methods have been shown to solve novel tasks by leveraging experience gained from performing other tasks during training. Although studies on meta-learning for robot control have focused on improving performance, safety, which is also an important consideration in deployment, has not been fully explored. In this paper, we first relate uncertainty in task inference to safety in meta-learning of visual imitation, and then propose PETNet, a novel framework that estimates task uncertainty through probabilistic inference in the task-embedding space. We validate PETNet on a manipulation task with a simulated robot arm, evaluating both task performance and the uncertainty of task inference. Following the standard benchmark procedure in meta-imitation learning, we show that PETNet achieves performance (success rate on novel tasks at meta-test time) equal to or higher than that of previous methods. In addition, when tested with semantically inappropriate or synthesized out-of-distribution demonstrations, PETNet captures the uncertainty about the tasks inherent in the given demonstrations, which allows the robot to identify situations where the controller might not perform properly. These results illustrate that our proposal takes a significant step toward the safe deployment of robot learning systems in diverse tasks and environments.
