IEEE Trans Neural Netw Learn Syst. 2021 Sep;32(9):4013-4025. doi: 10.1109/TNNLS.2020.3016523. Epub 2021 Aug 31.
In this article, we present a generic locomotion control framework for legged robots and a strategy for control policy optimization. The framework is based on neural control and black-box optimization. The neural control combines a central pattern generator (CPG) with a radial basis function (RBF) network to form a CPG-RBF network. This network acts as a neural basis that can produce arbitrary rhythmic trajectories for the robot's joints. The main features of the CPG-RBF network are: 1) it is generic, since it can be applied to legged robots with different morphologies; 2) it has few control parameters, resulting in fast learning; 3) it is scalable, both in terms of policy/trajectory complexity and in the number of legs that can be controlled using similar trajectories; 4) it does not rely heavily on sensory feedback to generate locomotion and is thus less prone to sensory faults; and 5) once trained, it is simple, minimal, and intuitive to use and analyze. Together, these features yield an easy-to-use framework with fast convergence and the ability to encode complex locomotion control policies. We show that the framework can be applied successfully to three simulated legged robots with different morphologies, including robots with broken joints, to learn locomotion control policies. We also show that, after learning, the control policies can be transferred to a real-world robot without any modifications. We further demonstrate the scalability of the framework by implementing it both as a central controller for all legs of a robot and as a decentralized controller for individual legs and leg pairs. By investigating the correlation between robot morphology and encoding type, we derive a strategy for control policy optimization. Finally, we show how sensory feedback can be integrated into the CPG-RBF network to enable online adaptation.
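The CPG-RBF idea described above can be illustrated with a minimal sketch. The abstract does not specify the network equations, so everything concrete here is an assumption made for illustration: the two-neuron tanh oscillator used as the CPG and its parameters (`phi`, `alpha`), the Gaussian RBF width (`sigma2`), the number of RBF centers, the sinusoidal toy target, and the simple accept-if-better random search that stands in for the unspecified black-box optimizer. No robot, simulator, or sensory feedback is modeled.

```python
import math
import random


def cpg_outputs(steps, phi=0.05 * math.pi, alpha=1.2, warmup=500):
    """Iterate a two-neuron tanh oscillator (assumed CPG) and return output pairs."""
    w_self = alpha * math.cos(phi)   # self-connection weight
    w_cross = alpha * math.sin(phi)  # cross-connection weight (opposite signs)
    o1, o2 = 0.2, 0.0
    trace = []
    for t in range(warmup + steps):
        o1, o2 = (math.tanh(w_self * o1 + w_cross * o2),
                  math.tanh(-w_cross * o1 + w_self * o2))
        if t >= warmup:  # discard the transient before recording
            trace.append((o1, o2))
    return trace


def rbf_layer(o, centers, sigma2=0.04):
    """Gaussian RBF activations of the CPG state `o` for each center."""
    return [math.exp(-((o[0] - c[0]) ** 2 + (o[1] - c[1]) ** 2) / sigma2)
            for c in centers]


def joint_command(o, centers, weights):
    """Linear readout: weighted sum of RBF activations gives one joint target."""
    return sum(w * r for w, r in zip(weights, rbf_layer(o, centers)))


def cost(weights, trace, centers, target):
    """Toy objective: mean squared error against a hand-picked trajectory."""
    errors = [(joint_command(o, centers, weights) - y) ** 2
              for o, y in zip(trace, target)]
    return sum(errors) / len(errors)


def random_search(trace, centers, target, iterations=300, noise=0.1, seed=0):
    """Accept-if-better random search (a stand-in black-box optimizer)."""
    rng = random.Random(seed)
    best = [0.0] * len(centers)
    best_cost = cost(best, trace, centers, target)
    for _ in range(iterations):
        candidate = [w + rng.gauss(0.0, noise) for w in best]
        candidate_cost = cost(candidate, trace, centers, target)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


PERIOD = 40                        # roughly one CPG period, about 2*pi/phi steps
trace = cpg_outputs(PERIOD)
centers = trace[::2]               # 20 RBF centers placed along the CPG cycle
target = [0.5 * math.sin(2 * math.pi * t / PERIOD) for t in range(PERIOD)]

initial_cost = cost([0.0] * len(centers), trace, centers, target)
weights, final_cost = random_search(trace, centers, target)
print(f"cost before: {initial_cost:.4f}, after: {final_cost:.4f}")
```

In this sketch the only learned parameters are the readout weights, one per RBF center and joint, which is one way to read the abstract's claim that the network has few control parameters. Because the optimizer only evaluates a scalar cost, the toy objective could in principle be replaced by a fitness measured in simulation.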