Target Tracking Control of a Biomimetic Underwater Vehicle Through Deep Reinforcement Learning.

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):3741-3752. doi: 10.1109/TNNLS.2021.3054402. Epub 2022 Aug 3.

Abstract

In this article, the underwater target tracking control problem of a biomimetic underwater vehicle (BUV) is addressed. Because hydrodynamic uncertainty makes it difficult to build an effective mathematical model of a BUV, target tracking control is formulated as a Markov decision process and solved via deep reinforcement learning. The system state and reward function of underwater target tracking control are described. Based on the actor-critic reinforcement learning framework, a deep deterministic policy gradient actor-critic algorithm with a supervision controller is proposed. The training tricks are presented: prioritized experience replay, indirect supervision training of the actor network, target network updating with different periods, and expansion of the exploration space by applying random noise. Indirect supervision training is designed to address the low stability and slow convergence of reinforcement learning in continuous state and action spaces. Comparative simulations are performed to show the effectiveness of the training tricks. Finally, the proposed actor-critic reinforcement learning algorithm with a supervision controller is applied to the physical BUV. Swimming pool experiments on underwater object tracking with the BUV are conducted in multiple scenarios to verify the effectiveness and robustness of the proposed method.
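The abstract names its training tricks without giving their details, so the following is only a rough sketch of three of them under stated assumptions, not the authors' implementation. It shows proportional prioritized experience replay, target updates on separate schedules (one possible reading of "different periods"), and Gaussian exploration noise clipped to an action range. All names (`PrioritizedReplayBuffer`, `soft_update`, `should_update`, `noisy_action`) and all constants (`alpha`, `eps`, `tau`, the periods, the noise scale) are hypothetical and chosen for illustration.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative; details assumed, not from the paper)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling (assumed value)
        self.eps = eps      # keeps every priority strictly positive
        self.items = []
        self.priorities = []
        self.next_slot = 0

    def add(self, transition):
        # New transitions receive the current maximum priority so they get replayed soon.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer is full: overwrite the oldest slot.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        indices = rng.choices(range(len(self.items)), weights=weights, k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update(self, indices, td_errors):
        # After a learning step, set each priority to the magnitude of its TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


def soft_update(target, online, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target, online)]


def should_update(step, period):
    """True on the steps where a target network with this period is refreshed."""
    return step % period == 0


def noisy_action(action, sigma, low, high, rng=random):
    """Add Gaussian exploration noise, then clip to the actuator limits."""
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]


# Hypothetical schedule: the two target networks are refreshed at different periods.
ACTOR_TARGET_PERIOD = 4   # assumed value
CRITIC_TARGET_PERIOD = 2  # assumed value
actor_target, critic_target = [0.0], [0.0]
actor_online, critic_online = [1.0], [1.0]
for step in range(1, 9):
    if should_update(step, CRITIC_TARGET_PERIOD):
        critic_target = soft_update(critic_target, critic_online, tau=0.5)
    if should_update(step, ACTOR_TARGET_PERIOD):
        actor_target = soft_update(actor_target, actor_online, tau=0.5)
```

In this toy schedule the critic target is refreshed twice as often as the actor target, so it tracks its online network more closely. Whether the paper assigns the shorter period to the actor or the critic is not stated in the abstract.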
