USV Trajectory Tracking Control Based on Receding Horizon Reinforcement Learning.

Authors

Wen Yinghan, Chen Yuepeng, Guo Xuan

Affiliations

School of Automation, Wuhan University of Technology, Wuhan 430070, China.

School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China.

Publication information

Sensors (Basel). 2024 Apr 26;24(9):2771. doi: 10.3390/s24092771.

Abstract

We present an approach to high-precision trajectory tracking control of an unmanned surface vehicle (USV) based on receding horizon reinforcement learning (RHRL). The USV control architecture combines feedforward and feedback components. The feedforward component is derived directly from the curvature of the reference path and the dynamic model. The feedback component is obtained with the RHRL algorithm, which solves the optimal tracking control problem. The method incorporates a receding-horizon optimization mechanism that converts the infinite-horizon optimal control problem into a sequence of tractable finite-horizon control problems. Unlike Lyapunov model predictive control (LMPC) and sliding mode control (SMC), the RHRL controller yields an explicit state feedback control law, so it supports both direct deployment after offline learning and online learning. Within each prediction horizon, a time-independent actor-critic network structure learns the optimal value function and control policy. We prove that the RHRL algorithm converges within each prediction horizon and analyze the stability of the closed-loop system. Finally, USV trajectory control tests are carried out in a simulated environment.
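The control structure described in the abstract (a model-based feedforward term plus a feedback law recomputed over successive finite horizons) can be illustrated with a small sketch. This is not the RHRL algorithm from the paper: the learned actor-critic is replaced here by a finite-horizon LQR solved with a backward Riccati recursion, and the plant is a hypothetical scalar system `x[k+1] = a*x[k] + b*u[k]` rather than a USV model. All names and parameter values (`finite_horizon_gains`, `track`, `a`, `b`, `q`, `r`, `horizon`) are assumptions made for illustration.

```python
import math


def finite_horizon_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for the error dynamics e' = a*e + b*u.

    Stand-in for the learned feedback policy: returns one explicit
    state-feedback gain per step of the prediction horizon.
    """
    p = q  # terminal cost weight (assumed equal to the stage weight)
    gains = []
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
        gains.append(k)
    gains.reverse()  # gains[0] now belongs to the first step of the horizon
    return gains


def track(a, b, reference, x0, horizon=10, q=1.0, r=0.1):
    """Track `reference` with feedforward + receding-horizon feedback."""
    x = x0
    errors = []
    for k in range(len(reference) - 1):
        # Feedforward: the input that keeps the plant on the reference
        # when the tracking error is zero (derived from the model).
        u_ff = (reference[k + 1] - a * reference[k]) / b
        # Feedback: solve a finite-horizon problem, apply only its first
        # gain, then shift the horizon forward at the next step.
        gains = finite_horizon_gains(a, b, q, r, horizon)
        u_fb = -gains[0] * (x - reference[k])
        x = a * x + b * (u_ff + u_fb)
        errors.append(abs(x - reference[k + 1]))
    return errors


# Hypothetical open-loop-unstable plant following a sinusoidal reference.
reference = [math.sin(0.1 * k) for k in range(60)]
errors = track(a=1.05, b=0.5, reference=reference, x0=1.0)
print(f"first error: {errors[0]:.3e}, last error: {errors[-1]:.3e}")
```

Because the sketch's plant is time-invariant, every horizon produces the same gains; the point is only to show how an infinite-horizon tracking task can be handled as a sequence of finite-horizon problems that each yield an explicit feedback law.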

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2837/11086230/d99d644a684c/sensors-24-02771-g001.jpg
