Zhao Yujiao, Ma Yong, Hu Songlin
IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5468-5478. doi: 10.1109/TNNLS.2021.3068762. Epub 2021 Nov 30.
This article addresses path following for formations of underactuated unmanned surface vessels (USVs) using a modified deep reinforcement learning method with random braking (DRLRB). A formation control model based on deep reinforcement learning (DRL) is constructed to drive the USVs into a preset formation. Specifically, an efficient reward function is designed in terms of each USV's velocity and its distance error relative to the given formation, and a novel random braking mechanism is then formulated to prevent training of the decision-making network from becoming trapped in a local optimum and failing to reach the training objectives. A virtual-leader-based path-following guidance system is then developed for the USV formation problem. With the aid of DRLRB, the proposed system can adjust the formation automatically and flexibly even when some USVs deviate from it. Simulations verify the effectiveness and superiority of the proposed formation and path-following control strategy.
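The abstract does not give the reward function or the braking rule in closed form, so the following is only a rough sketch of how the two ideas might fit together. Everything in it is an assumption made for illustration: the function names, the weights `w_dist` and `w_speed`, the linear penalty shape, the `(thrust, rudder)` action layout, and the reading of "random braking" as occasionally overriding the thrust command with zero during training. None of it is taken from the paper.

```python
import math
import random


def formation_reward(pos, target, speed, ref_speed, w_dist=1.0, w_speed=0.5):
    """Hypothetical reward for one USV (not the paper's formula).

    Penalizes the distance between the vessel and its assigned slot in the
    formation, plus the mismatch between its speed and a reference speed.
    """
    dist_err = math.hypot(pos[0] - target[0], pos[1] - target[1])
    speed_err = abs(speed - ref_speed)
    return -(w_dist * dist_err + w_speed * speed_err)


def apply_random_braking(action, brake_prob, rng):
    """Assumed braking rule (not the paper's mechanism).

    With probability brake_prob, the thrust command is replaced by zero while
    the rudder command is kept, so the agent also experiences low-speed states
    during training. Otherwise the action is returned unchanged.
    """
    thrust, rudder = action
    if rng.random() < brake_prob:
        return (0.0, rudder)
    return action


# Example: a vessel 5 m from its slot, already moving at the reference speed.
rng = random.Random(0)
reward = formation_reward(pos=(0.0, 0.0), target=(3.0, 4.0), speed=1.0, ref_speed=1.0)
train_action = apply_random_braking((1.0, 0.2), brake_prob=0.1, rng=rng)
```

In a sketch like this, braking would be applied only while collecting training experience and switched off (`brake_prob=0.0`) at evaluation time. Whether the paper does the same is not stated in the abstract.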