Pagliuca Paolo, Milano Nicola, Nolfi Stefano
Laboratory of Autonomous Robots and Artificial Life, Institute of Cognitive Science and Technologies, National Research Council, Rome, Italy.
Faculty of Computer Science and Engineering, Innopolis University, Innopolis, Russia.
Front Robot AI. 2020 Jul 28;7:98. doi: 10.3389/frobt.2020.00098. eCollection 2020.
We analyze the efficacy of modern neuro-evolutionary strategies for continuous control optimization. Overall, the results collected on a wide variety of qualitatively different benchmark problems indicate that these methods are generally effective and scale well with the number of parameters and the complexity of the problem. Moreover, they are relatively robust to the setting of hyper-parameters. A comparison of the most promising methods indicates that the OpenAI-ES algorithm outperforms or equals the other algorithms on all considered problems. Moreover, we demonstrate that reward functions optimized for reinforcement learning methods are not necessarily effective for evolutionary strategies, and vice versa. This finding may prompt a reconsideration of the relative efficacy of the two classes of algorithms, since it implies that the comparisons performed to date are biased toward one class or the other.
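The OpenAI-ES algorithm singled out above estimates a search gradient by perturbing the parameter vector with Gaussian noise, scoring each perturbation, and moving along the reward-weighted average of the noise. A minimal sketch follows; the toy quadratic objective and all names here are illustrative assumptions (the paper evaluates control benchmarks), and simple reward normalization stands in for the rank-based fitness shaping used in the full algorithm:

```python
import random

def openai_es(f, theta, iterations=300, pop_size=50, sigma=0.1, alpha=0.02):
    """Minimal OpenAI-ES sketch: perturb the parameters with Gaussian
    noise, score each perturbation, and ascend the reward-weighted
    average of the noise (a finite-sample search-gradient estimate)."""
    rng = random.Random(0)  # fixed seed for a reproducible demo
    n = len(theta)
    for _ in range(iterations):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                 for _ in range(pop_size)]
        rewards = [f([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noise]
        # Normalize rewards (the full algorithm uses rank-based shaping).
        mean_r = sum(rewards) / pop_size
        std_r = (sum((r - mean_r) ** 2 for r in rewards) / pop_size) ** 0.5
        shaped = [(r - mean_r) / (std_r + 1e-8) for r in rewards]
        # Gradient-ascent step on the estimated search gradient.
        for i in range(n):
            g = sum(s * eps[i] for s, eps in zip(shaped, noise))
            theta[i] += alpha / (pop_size * sigma) * g
    return theta

# Toy "reward": higher is better, optimum at (3, -1).
best = openai_es(lambda p: -((p[0] - 3) ** 2 + (p[1] + 1) ** 2),
                 [0.0, 0.0])
```

Because the update depends only on scalar rewards, the method needs no backpropagation through the controller and parallelizes trivially across the population, which is part of why it scales well with the number of parameters.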