Learning quadrupedal locomotion on deformable terrain.

Affiliation

Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea.

Publication information

Sci Robot. 2023 Jan 25;8(74):eade2256. doi: 10.1126/scirobotics.ade2256.

Abstract

Simulation-based reinforcement learning approaches are leading the next innovations in legged robot control. However, the resulting control policies are still not applicable on soft and deformable terrains, especially at high speed. The primary reason is that reinforcement learning approaches, in general, are not effective beyond the data distribution: The agent cannot perform well in environments that it has not experienced. To this end, we introduce a versatile and computationally efficient granular media model for reinforcement learning. Our model can be parameterized to represent diverse types of terrain from very soft beach sand to hard asphalt. In addition, we introduce an adaptive control architecture that can implicitly identify the terrain properties as the robot feels the terrain. The identified parameters are then used to boost the locomotion performance of the legged robot. We applied our techniques to the Raibo robot, a dynamic quadrupedal robot developed in-house. The trained networks demonstrated high-speed locomotion capabilities on deformable terrains: The robot was able to run on soft beach sand at 3.03 meters per second although the feet were completely buried in the sand during the stance phase. We also demonstrate its ability to generalize to different terrains by presenting running experiments on vinyl tile flooring, athletic track, grass, and a soft air mattress.
