Robotic Systems Lab, ETH Zurich, Zurich, Switzerland.
Intelligent Systems Lab, Intel, Munich, Germany.
Sci Robot. 2019 Jan 16;4(26). doi: 10.1126/scirobotics.aau5872.
Legged robots pose one of the greatest challenges in robotics. Existing, human-crafted control methods cannot reproduce the dynamic and agile maneuvers of animals. A compelling alternative is reinforcement learning, which requires minimal craftsmanship and promotes the natural evolution of a control policy. So far, however, reinforcement learning research for legged robots has mainly been limited to simulation, and only a few comparably simple examples have been deployed on real systems. The primary reason is that training with real robots, particularly with dynamically balancing systems, is complicated and expensive. In the present work, we introduce a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes. The approach is applied to the ANYmal robot, a sophisticated medium-dog-sized quadrupedal system. Using policies trained in simulation, the quadrupedal machine achieves locomotion skills that go beyond what had been achieved with prior methods: ANYmal is capable of precisely and energy-efficiently following high-level body velocity commands, running faster than ever before, and recovering from falling even in complex configurations.
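The train-in-simulation, deploy-to-hardware idea described above can be sketched in miniature. The toy below is purely illustrative and is not the paper's method: the 1-D "simulator," the two-parameter policy, and the random-search trainer are all assumptions for the sake of a self-contained example (the actual work trains a deep neural network policy on a full rigid-body simulation of ANYmal with a policy-gradient algorithm). It only shows the structure of the loop: evaluate a policy cheaply in simulation, improve it, and keep the best parameters for deployment.

```python
import math
import random

random.seed(0)

# Toy stand-in for a physics simulator: a 1-D "body" whose velocity should
# track a commanded value. The real system simulates the full ANYmal robot;
# this first-order model is an illustrative assumption only.
def rollout(policy_w, command=1.0, steps=50):
    vel, total_err = 0.0, 0.0
    for _ in range(steps):
        # Tiny "policy": a saturated linear feedback law on the tracking error.
        action = math.tanh(policy_w[0] * (command - vel) + policy_w[1])
        vel += 0.1 * action               # crude first-order dynamics
        total_err += (command - vel) ** 2
    return -total_err                     # reward = negative tracking error


# Train entirely in "simulation" via simple random search (hypothetical here;
# the paper uses a policy-gradient method on neural network parameters).
def train(iters=200, sigma=0.1):
    w = [0.0, 0.0]
    best = rollout(w)
    for _ in range(iters):
        cand = [wi + random.gauss(0.0, sigma) for wi in w]
        r = rollout(cand)
        if r > best:                      # hill-climb: keep improvements only
            w, best = cand, r
    return w, best


if __name__ == "__main__":
    w, reward = train()
    print("trained weights:", w, "simulated reward:", reward)
```

Because every rollout is simulated, thousands of candidate policies can be evaluated quickly, safely, and cheaply before anything touches hardware, which is the data-generation advantage the abstract highlights.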