IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2259-2270. doi: 10.1109/TNNLS.2017.2690910. Epub 2017 Apr 17.
The reinforcement learning (RL) paradigm allows agents to solve tasks through trial-and-error learning. To be capable of efficient, long-term learning, RL agents should be able to apply knowledge gained in the past to new tasks they may encounter in the future. The ability to predict actions' consequences may facilitate such knowledge transfer. We consider here domains where an RL agent has access to two kinds of information: agent-centric information with constant semantics across tasks, and environment-centric information, which is necessary to solve the task, but with semantics that differ between tasks. For example, in robot navigation, environment-centric information may include the robot's geographic location, while agent-centric information may include sensor readings of various nearby obstacles. We propose that these situations provide an opportunity for a very natural style of knowledge transfer, in which the agent learns to predict actions' environmental consequences using agent-centric information. These predictions contain important information about the affordances and dangers present in a novel environment, and can effectively transfer knowledge from agent-centric to environment-centric learning systems. Using several example problems including spatial navigation and network routing, we show that our knowledge transfer approach can allow faster and lower cost learning than existing alternatives.
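The transfer idea in the abstract can be illustrated with a minimal sketch. The scenario below is hypothetical (not the paper's actual experiments): 1-D corridor worlds where the agent's integer position is environment-centric (its meaning changes between corridors), while a local "is the next cell blocked?" sensor reading is agent-centric (its semantics are constant). A consequence model learned from the sensor in one corridor then transfers directly to a novel corridor.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical corridor worlds: positions are environment-centric,
# the local obstacle sensor is agent-centric.
def make_corridor(length, walls):
    """walls: set of blocked positions."""
    return {"length": length, "walls": walls}

def sensor(world, pos, action):
    """Agent-centric reading: 1 if the cell this action moves into is blocked."""
    nxt = pos + action
    return int(nxt < 0 or nxt >= world["length"] or nxt in world["walls"])

# Phase 1: in a source corridor, learn an action-consequence model mapping
# (sensor reading, action) -> empirical probability that the move fails.
model = defaultdict(lambda: [0, 0])  # key -> [failure count, try count]
src = make_corridor(10, {3, 7})
for _ in range(2000):
    pos = random.randrange(src["length"])
    action = random.choice([-1, 1])
    reading = sensor(src, pos, action)
    nxt = pos + action
    failed = int(nxt < 0 or nxt >= src["length"] or nxt in src["walls"])
    stats = model[(reading, action)]
    stats[0] += failed
    stats[1] += 1

def predicted_fail(reading, action):
    failed, tries = model[(reading, action)]
    return failed / tries if tries else 0.5

# Phase 2: in a novel corridor the positions mean something different,
# but the sensor semantics carry over, so the learned model transfers
# without retraining; here it seeds pessimistic initial Q-values for
# moves it predicts will fail.
tgt = make_corridor(6, {2})
q_init = {}
for pos in range(tgt["length"]):
    for action in (-1, 1):
        q_init[(pos, action)] = -predicted_fail(sensor(tgt, pos, action), action)
```

The sensor here is deliberately a perfect predictor of failure, so the transferred model is exact; in the paper's richer domains (spatial navigation, network routing) the agent-centric predictions are instead learned approximations that bias, rather than determine, learning in the new environment.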