
Multi-Channel Interactive Reinforcement Learning for Sequential Tasks

Authors

Koert Dorothea, Kircher Maximilian, Salikutluk Vildan, D'Eramo Carlo, Peters Jan

Affiliations

Intelligent Autonomous Systems Group, Department of Computer Science, Technische Universität Darmstadt, Darmstadt, Germany.

Center for Cognitive Science, Technische Universität Darmstadt, Darmstadt, Germany.

Publication

Front Robot AI. 2020 Sep 24;7:97. doi: 10.3389/frobt.2020.00097. eCollection 2020.

Abstract

The ability to learn new tasks by sequencing already known skills is an important requirement for future robots. Reinforcement learning is a powerful tool for this, as it allows a robot to learn and improve how to combine skills for sequential tasks. However, in real robotic applications, the cost of sample collection and exploration prevents the application of reinforcement learning to a variety of tasks. To overcome these limitations, human input during reinforcement learning can be beneficial to speed up learning, guide exploration, and prevent the choice of disastrous actions. Nevertheless, there is a lack of experimental evaluations of multi-channel interactive reinforcement learning systems solving robotic tasks with input from inexperienced human users, in particular for cases where human input might be partially wrong. Therefore, in this paper, we present an approach that incorporates multiple human input channels for interactive reinforcement learning in a unified framework and evaluate it on two robotic tasks with 20 inexperienced human subjects. To enable the robot to also handle potentially incorrect human input, we incorporate a novel concept of self-confidence, which allows the robot to question human input after an initial learning phase. The second robotic task is specifically designed to investigate whether this self-confidence can enable the robot to achieve learning progress even if the human input is partially incorrect. Further, we evaluate how humans react to the robot's suggestions once the robot notices that human input might be wrong. Our experimental evaluations show that our approach can successfully incorporate human input to accelerate the learning process in both robotic tasks, even if that input is partially wrong. However, not all humans were willing to accept the robot's suggestions or its questioning of their input, particularly if they did not understand the learning process and the reasons behind the robot's suggestions.
We believe that the findings from this experimental evaluation can be beneficial for the future design of algorithms and interfaces of interactive reinforcement learning systems used by inexperienced users.
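To make the self-confidence idea concrete, the following is a minimal illustrative sketch of interactive tabular Q-learning in which an agent follows human action advice until it has gathered enough of its own experience, and then "questions" advice that disagrees with its value estimates. All names, thresholds, and the visit-count confidence rule are assumptions for illustration, not the formulation actually used in the paper.

```python
import random

class InteractiveQLearner:
    """Tabular Q-learning that blends human action advice with its own
    value estimates. Once the agent is self-confident in a state, it may
    override (question) advice that conflicts with its greedy action."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9,
                 epsilon=0.2, confidence_visits=20):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.visits = [[0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Experience required per action before the agent questions advice.
        self.confidence_visits = confidence_visits

    def self_confident(self, state):
        # Confident once every action in this state was tried often enough.
        return min(self.visits[state]) >= self.confidence_visits

    def choose(self, state, advice=None):
        greedy = max(range(len(self.q[state])), key=lambda a: self.q[state][a])
        if advice is not None:
            # Follow the human unless the agent is confident and disagrees.
            if (not self.self_confident(state)
                    or self.q[state][advice] >= self.q[state][greedy]):
                return advice
            return greedy  # question the human input
        # No advice: ordinary epsilon-greedy exploration.
        if random.random() < self.epsilon:
            return random.randrange(len(self.q[state]))
        return greedy

    def update(self, s, a, r, s_next):
        self.visits[s][a] += 1
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
```

Early in training the agent defers to advice on every step, so even partially wrong input still shapes exploration; only after the initial learning phase does the confidence check gate whether advice is followed.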


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595f/7805623/1e9f8332996d/frobt-07-00097-g0001.jpg
