Reinforcement Learning for Improving Agent Design.

Affiliation

Google Brain, Tokyo, Japan.

Publication information

Artif Life. 2019 Fall;25(4):352-365. doi: 10.1162/artl_a_00301. Epub 2019 Nov 7.

Abstract

In many reinforcement learning tasks, the goal is to learn a policy to manipulate an agent, whose design is fixed, to maximize some notion of cumulative reward. The design of the agent's physical structure is rarely optimized for the task at hand. In this work, we explore the possibility of learning a version of the agent's design that is better suited for its task, jointly with the policy. We propose an alteration to the popular OpenAI Gym framework, where we parameterize parts of an environment, and allow an agent to jointly learn to modify these environment parameters along with its policy. We demonstrate that an agent can learn a better structure of its body that is not only better suited for the task, but also facilitates policy learning. Joint learning of policy and structure may even uncover design principles that are useful for assisted-design applications.
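The idea of exposing part of an environment as learnable parameters and optimizing them together with the policy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: `ToyReachEnv` is a hypothetical Gym-style class (it does not import or extend OpenAI Gym), the arm-length design parameter, the linear policy, and the quadratic material cost are invented, and a simple hill-climbing search stands in for whatever optimizer the paper uses.

```python
import random


class ToyReachEnv:
    """Hypothetical Gym-style environment whose body (arm length) is a parameter."""

    def __init__(self, arm_length, target=3.0, max_step=0.1):
        self.arm_length = max(arm_length, 0.0)  # design parameter, kept non-negative
        self.target = target
        self.max_step = max_step
        self.base = 0.0

    def reset(self):
        self.base = 0.0
        # Observation: signed gap between the target and the arm tip.
        return self.target - (self.base + self.arm_length)

    def step(self, action):
        # Clip the action so the base moves a bounded distance per step.
        self.base += max(-self.max_step, min(self.max_step, action))
        gap = self.target - (self.base + self.arm_length)
        return gap, -abs(gap)  # (observation, reward)


def episode_return(params, steps=20):
    """Evaluate one (policy, design) pair; params = [policy gain, arm length]."""
    gain, arm_length = params
    env = ToyReachEnv(arm_length)
    obs = env.reset()
    total = 0.0
    for _ in range(steps):
        obs, reward = env.step(gain * obs)  # linear policy on the observed gap
        total += reward
    # Assumed material cost, so that a longer arm is not free.
    return total - 0.5 * env.arm_length ** 2


def joint_hill_climb(init_params, iterations=300, sigma=0.1, seed=0):
    """Perturb policy and design parameters together; keep improvements only."""
    rng = random.Random(seed)
    best = list(init_params)
    best_return = episode_return(best)
    for _ in range(iterations):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        candidate_return = episode_return(candidate)
        if candidate_return > best_return:
            best, best_return = candidate, candidate_return
    return best, best_return


initial = [0.0, 1.0]  # zero policy gain, arm length 1.0
initial_return = episode_return(initial)
learned, learned_return = joint_hill_climb(initial)
print("initial return:", initial_return)
print("learned [gain, arm_length]:", learned, "return:", learned_return)
```

Because the search perturbs the policy gain and the arm length in the same parameter vector, a change to the body is accepted only when it helps the task under the current policy, which is the sense of "joint" learning in this sketch.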

Similar articles

LJIR: Learning Joint-Action Intrinsic Reward in cooperative multi-agent reinforcement learning.

Neural Netw. 2023 Oct;167:450-459. doi: 10.1016/j.neunet.2023.08.016. Epub 2023 Aug 22.

A reinforcement learning algorithm acquires demonstration from the training agent by dividing the task space.

Neural Netw. 2023 Jul;164:419-427. doi: 10.1016/j.neunet.2023.04.042. Epub 2023 May 5.

A Hybrid Online Off-Policy Reinforcement Learning Agent Framework Supported by Transformers.

Int J Neural Syst. 2023 Dec;33(12):2350065. doi: 10.1142/S012906572350065X. Epub 2023 Oct 20.

Self-Supervised Discovering of Interpretable Features for Reinforcement Learning.

IEEE Trans Pattern Anal Mach Intell. 2022 May;44(5):2712-2724. doi: 10.1109/TPAMI.2020.3037898. Epub 2022 Apr 1.

Human-level control through deep reinforcement learning.

Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.

Human locomotion with reinforcement learning using bioinspired reward reshaping strategies.

Med Biol Eng Comput. 2021 Jan;59(1):243-256. doi: 10.1007/s11517-020-02309-3. Epub 2021 Jan 8.

Stochastic abstract policies: generalizing knowledge to improve reinforcement learning.

IEEE Trans Cybern. 2015 Jan;45(1):77-88. doi: 10.1109/TCYB.2014.2319733. Epub 2014 May 13.

Meta attention for Off-Policy Actor-Critic.

Neural Netw. 2023 Jun;163:86-96. doi: 10.1016/j.neunet.2023.03.024. Epub 2023 Mar 28.

Fast reinforcement learning with generalized policy updates.

Proc Natl Acad Sci U S A. 2020 Dec 1;117(48):30079-30087. doi: 10.1073/pnas.1907370117. Epub 2020 Aug 17.

Cited by

Novel reinforcement learning technique based parameter estimation for proton exchange membrane fuel cell model.

Sci Rep. 2024 Nov 11;14(1):27475. doi: 10.1038/s41598-024-78001-5.

Biological Robots: Perspectives on an Emerging Interdisciplinary Field.

Soft Robot. 2023 Aug;10(4):674-686. doi: 10.1089/soro.2022.0142. Epub 2023 Apr 20.

Evolution of Brains and Computers: The Roads Not Taken.

Entropy (Basel). 2022 May 9;24(5):665. doi: 10.3390/e24050665.

Embodied intelligence via learning and evolution.

Nat Commun. 2021 Oct 6;12(1):5721. doi: 10.1038/s41467-021-25874-z.
