
Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning.

Affiliations

Department of Computer Science, Technical University of Munich, Germany.

Department of Computer Science, Ludwig Maximilian University of Munich, Germany.

Publication information

Neural Netw. 2020 Sep;129:323-333. doi: 10.1016/j.neunet.2020.05.029. Epub 2020 Jun 16.

Abstract

Similar to real snakes in nature, the flexible trunks of snake-like robots enhance their movement capability and adaptability in diverse environments. However, this flexibility entails a complex control task involving highly redundant degrees of freedom, where traditional model-based methods usually fail to propel the robots energy-efficiently or to adapt to unforeseeable joint damage. In this work, we present an approach for designing an energy-efficient and damage-recovery slithering gait for a snake-like robot using reinforcement learning (RL) and inverse reinforcement learning (IRL). Specifically, we first present an RL-based controller for generating locomotion gaits over a wide range of velocities, trained with the proximal policy optimization (PPO) algorithm. Then, taking the RL-based controller as an expert and collecting trajectories from it, we train an IRL-based controller using the adversarial inverse reinforcement learning (AIRL) algorithm. For comparison, a traditional parameterized gait controller is presented as the baseline, with its parameter sets optimized via grid search and Bayesian optimization. Based on an analysis of the simulation results, we first demonstrate that the RL-based controller exhibits very natural and adaptive movements, which are also substantially more energy-efficient than the gaits generated by the parameterized controller. We then demonstrate that the IRL-based controller not only matches the performance of the RL-based controller, but can also recover from unpredictable damage to body joints and still outperform the model-based controller, which has an undamaged body, in terms of energy efficiency. Videos can be viewed at https://videoviewsite.wixsite.com/rlsnake.
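The abstract does not specify the form of the parameterized baseline controller, but parameterized slithering gaits for snake-like robots are commonly built on the serpenoid curve, where each joint tracks a phase-shifted sinusoid. The sketch below illustrates that standard formulation; the function and parameter names (`amplitude`, `omega`, `phase_offset`, `bias`) are hypothetical stand-ins, not the paper's actual parameter set.

```python
import math

def serpenoid_joint_angles(t, n_joints=8, amplitude=0.5,
                           omega=2.0, phase_offset=0.6, bias=0.0):
    """Serpenoid-curve gait sketch (assumed baseline form, not the paper's exact controller).

    Returns the target angle (radians) for each of n_joints at time t:
        theta_i(t) = amplitude * sin(omega * t + i * phase_offset) + bias
    Grid search or Bayesian optimization would tune amplitude, omega,
    phase_offset, and bias for speed and energy efficiency.
    """
    return [amplitude * math.sin(omega * t + i * phase_offset) + bias
            for i in range(n_joints)]
```

Because the whole gait is captured by a handful of scalars, such a controller is cheap to optimize but, unlike the learned RL/IRL policies, has no mechanism to compensate when individual joints fail.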

