IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3444-3459. doi: 10.1109/TNNLS.2021.3112718. Epub 2023 Jul 6.
State-of-the-art reinforcement learning (RL) techniques have driven numerous advances in robot control, especially in combination with deep neural networks (DNNs), an approach known as deep reinforcement learning (DRL). In this article, rather than revisiting the theoretical foundations of RL, which were largely established decades ago, we summarize recent techniques that augment commonly used RL frameworks for robot control. We focus on bioinspired robots (BIRs) because they can learn to locomote and produce natural behaviors resembling those of animals and humans. With the ultimate goal of practical real-world application, we further narrow the scope of the review to techniques that aid sim-to-real transfer. We categorize these techniques into four groups: 1) use of accurate simulators; 2) use of kinematic and dynamic models; 3) use of hierarchical and distributed controllers; and 4) use of demonstrations. These four groups serve, respectively, to provide general and accurate environments for RL training, to improve sampling efficiency, to divide and conquer complex motion tasks and redundant robot structures, and to acquire natural skills. We find that, by combining these techniques, it is feasible to deploy RL on physical BIRs in practice.
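As a minimal illustration of the first group (use of accurate simulators), the Python sketch below shows domain randomization, a technique commonly paired with simulator-based RL training to aid sim-to-real transfer. It is not taken from the article itself; the parameter names and ranges are illustrative assumptions.

    import random

    def sample_dynamics():
        """Draw randomized physical parameters for one training episode.

        Randomizing dynamics per episode forces the policy to succeed across
        a distribution of simulated robots that, ideally, contains the real one.
        All parameters and ranges below are hypothetical placeholders.
        """
        return {
            "mass_scale": random.uniform(0.8, 1.2),      # +/-20% around nominal mass
            "friction": random.uniform(0.5, 1.5),        # ground friction coefficient
            "motor_delay_s": random.uniform(0.0, 0.02),  # actuation latency in seconds
        }

    if __name__ == "__main__":
        for episode in range(3):
            params = sample_dynamics()
            # In a full pipeline, these parameters would configure the simulator
            # (e.g., a MuJoCo or PyBullet model) before collecting an RL rollout.
            print(f"episode {episode}: {params}")

Training a policy across such randomized dynamics encourages robustness to the mismatch between simulation and the physical robot, which is one way an "accurate" simulated environment supports real-world deployment.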