Chen Lin, Zhao Yongting, Zhao Huanjun, Zheng Bin
Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400700, China.
School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China.
Sensors (Basel). 2021 Jan 27;21(3):841. doi: 10.3390/s21030841.
This paper presents a novel decentralized multi-robot collision-avoidance method based on deep reinforcement learning. The method is suitable for large-scale multi-robot systems operating in grid-map workspaces, and it processes Lidar signals directly instead of relying on communication between the robots. Given the particular structure of the workspace, we handcrafted a reward function that accounts both for collision avoidance among the robots and for keeping each robot's changes of direction while driving as small as possible. The policy was trained with a Double Deep Q-Network (DDQN) in a simulated grid-map workspace. Through designed experiments, we demonstrate that the learned policy effectively guides each robot from its initial position to its goal position in the grid-map workspace while avoiding collisions with the other robots.
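The two ingredients named in the abstract, the Double DQN update and a reward that penalizes both collisions and direction changes, can be sketched as below. This is a minimal illustration, not the paper's implementation: the function names, the batch-style target computation, and all reward coefficients (`r_goal`, `r_collision`, `r_turn`, `r_step`) are assumptions for exposition.

```python
import numpy as np

def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double DQN bootstrap targets: the online network *selects* the
    next action, the target network *evaluates* it, which reduces the
    overestimation bias of vanilla DQN."""
    # Action chosen greedily by the online network for each next state.
    best_actions = np.argmax(q_online_next, axis=1)
    # That action's value under the (slower-moving) target network.
    next_q = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions (done == 1) carry no bootstrapped value.
    return rewards + gamma * (1.0 - dones) * next_q

def grid_reward(collided, reached_goal, turned,
                r_goal=1.0, r_collision=-1.0, r_turn=-0.05, r_step=-0.01):
    """Hypothetical reward shaping in the spirit of the abstract:
    penalize collisions, reward reaching the goal, and apply a small
    penalty whenever the robot changes direction. The paper's actual
    coefficients are not reproduced here."""
    if collided:
        return r_collision
    if reached_goal:
        return r_goal
    return r_step + (r_turn if turned else 0.0)
```

Keeping the turn penalty small relative to the collision penalty lets the agent still detour around other robots when necessary, while preferring straight paths otherwise.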