Eppe Manfred, Nguyen Phuong D H, Wermter Stefan
Department of Informatics, Knowledge Technology Institute, Universität Hamburg, Hamburg, Germany.
Front Robot AI. 2019 Nov 26;6:123. doi: 10.3389/frobt.2019.00123. eCollection 2019.
Reinforcement learning is generally accepted to be an appropriate and successful method to learn robot control. Symbolic action planning is useful to resolve causal dependencies and to break a causally complex problem down into a sequence of simpler high-level actions. A problem with the integration of both approaches is that action planning is based on discrete high-level state- and action spaces, whereas reinforcement learning is usually driven by a continuous reward function. Recent advances in model-free reinforcement learning, specifically, universal value function approximators and hindsight experience replay, have focused on goal-independent methods based on sparse rewards that are only given at the end of a rollout, and only if the goal has been fully achieved. In this article, we build on these novel methods to facilitate the integration of action planning with model-free reinforcement learning. Specifically, the paper demonstrates how reward sparsity can serve as a bridge between the high-level and low-level state- and action spaces. As a result, we demonstrate that the integrated method is able to solve robotic tasks that involve non-trivial causal dependencies under noisy conditions, exploiting both data and knowledge.
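The following minimal sketch (not the authors' implementation) illustrates the mechanisms the abstract refers to: a sparse, goal-conditioned reward that is only given when the goal is achieved, hindsight experience replay relabeling, and a loop that hands the subgoals produced by a symbolic plan to a goal-conditioned low-level policy. All names (sparse_reward, her_relabel, execute_plan, GOAL_TOLERANCE) and the env/policy interfaces are illustrative assumptions, not taken from the paper.

```python
import numpy as np

GOAL_TOLERANCE = 0.05  # assumed distance threshold for "goal achieved"

def sparse_reward(achieved_goal, desired_goal, tol=GOAL_TOLERANCE):
    """Sparse binary reward: 0.0 if the goal is achieved, -1.0 otherwise."""
    return 0.0 if np.linalg.norm(achieved_goal - desired_goal) <= tol else -1.0

def her_relabel(episode, k=4, rng=np.random.default_rng(0)):
    """Hindsight experience replay: re-store each transition with goals that
    were actually achieved later in the episode, so that failed rollouts
    still yield transitions with non-negative reward."""
    relabeled = []
    for t, (obs, action, achieved, desired, next_achieved) in enumerate(episode):
        # original transition with the originally desired goal
        relabeled.append((obs, action, desired,
                          sparse_reward(next_achieved, desired)))
        # k additional transitions relabeled with 'future' achieved goals
        for ft in rng.integers(t, len(episode), size=k):
            new_goal = episode[ft][4]
            relabeled.append((obs, action, new_goal,
                              sparse_reward(next_achieved, new_goal)))
    return relabeled

def execute_plan(env, policy, subgoals, max_steps=50):
    """Bridge from symbolic plan to control: each high-level action is mapped
    to a subgoal, which conditions the low-level policy; the sparse reward
    signals when the subgoal is reached and the plan can advance."""
    obs = env.reset()                         # hypothetical env interface
    for goal in subgoals:                     # high-level actions -> subgoals
        for _ in range(max_steps):            # low-level control loop
            action = policy(obs, goal)
            obs, achieved = env.step(action)  # assumed to return achieved goal
            if sparse_reward(achieved, goal) == 0.0:
                break                         # subgoal reached, next plan step
```

In this reading, reward sparsity acts as the bridge described in the abstract: the symbolic planner only needs to emit goals, and the low-level learner only needs a binary success signal per goal, without any hand-crafted shaped reward.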