Reinforcement Learning-Based Optimal Tracking Control of an Unknown Unmanned Surface Vehicle.

Authors

Wang Ning, Gao Ying, Zhao Hong, Ahn Choon Ki

Publication information

IEEE Trans Neural Netw Learn Syst. 2021 Jul;32(7):3034-3045. doi: 10.1109/TNNLS.2020.3009214. Epub 2021 Jul 6.

DOI:10.1109/TNNLS.2020.3009214
PMID:32745008
Abstract

In this article, a novel reinforcement learning-based optimal tracking control (RLOTC) scheme is established for an unmanned surface vehicle (USV) in the presence of complex unknowns, including dead-zone input nonlinearities, system dynamics, and disturbances. To be specific, dead-zone nonlinearities are decoupled to be input-dependent sloped controls and unknown biases that are encapsulated into lumped unknowns within tracking error dynamics. Neural network (NN) approximators are further deployed to adaptively identify complex unknowns and facilitate a Hamilton-Jacobi-Bellman (HJB) equation that formulates optimal tracking. In order to derive a practically optimal solution, an actor-critic reinforcement learning framework is built by employing adaptive NN identifiers to recursively approximate the total optimal policy and cost function. Eventually, theoretical analysis shows that the entire RLOTC scheme can render tracking errors that converge to an arbitrarily small neighborhood of the origin, subject to optimal cost. Simulation results and comprehensive comparisons on a prototype USV demonstrate remarkable effectiveness and superiority.
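
The dead-zone decoupling mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's parameterization, so the piecewise-linear dead-zone model below, the names `dead_zone` and `decompose_dead_zone`, and the slope and breakpoint values are all assumptions chosen for illustration. The sketch only shows that such a dead zone can be written as an input-dependent slope times the input plus a bounded bias, which is the kind of term that could be absorbed into a lumped unknown.

```python
# Illustrative sketch only: a common textbook dead-zone model, not
# necessarily the parameterization used in the paper.

def dead_zone(u, m_r=1.2, m_l=0.8, b_r=0.4, b_l=-0.3):
    """Piecewise-linear dead zone with assumed slopes and breakpoints."""
    if u >= b_r:
        return m_r * (u - b_r)
    if u <= b_l:
        return m_l * (u - b_l)
    return 0.0  # inside the dead band the actuator produces no output


def decompose_dead_zone(u, m_r=1.2, m_l=0.8, b_r=0.4, b_l=-0.3):
    """Return (slope, bias) such that dead_zone(u) == slope * u + bias."""
    slope = m_r if u > 0 else m_l  # input-dependent slope
    if u >= b_r:
        bias = -m_r * b_r
    elif u <= b_l:
        bias = -m_l * b_l
    else:
        bias = -slope * u  # cancels slope * u inside the dead band
    return slope, bias


if __name__ == "__main__":
    for u in (-1.0, -0.1, 0.0, 0.2, 1.0):
        slope, bias = decompose_dead_zone(u)
        print(u, dead_zone(u), slope * u + bias)
```

Under these assumed parameters the bias never exceeds max(m_r * b_r, m_l * |b_l|) in magnitude, so it behaves like a bounded disturbance that an adaptive identifier could estimate together with the other unknowns.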


Similar articles

1. Reinforcement Learning-Based Optimal Tracking Control of an Unknown Unmanned Surface Vehicle.
   IEEE Trans Neural Netw Learn Syst. 2021 Jul;32(7):3034-3045. doi: 10.1109/TNNLS.2020.3009214. Epub 2021 Jul 6.
2. Data-Driven Performance-Prescribed Reinforcement Learning Control of an Unmanned Surface Vehicle.
   IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5456-5467. doi: 10.1109/TNNLS.2021.3056444. Epub 2021 Nov 30.
3. Adaptive Optimal Tracking Control of an Underactuated Surface Vessel Using Actor-Critic Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):7520-7533. doi: 10.1109/TNNLS.2022.3214681. Epub 2024 Jun 3.
4. Adaptive optimal trajectory tracking control of AUVs based on reinforcement learning.
   ISA Trans. 2023 Jun;137:122-132. doi: 10.1016/j.isatra.2022.12.003. Epub 2022 Dec 8.
5. Actor-critic-based optimal tracking for partially unknown nonlinear discrete-time systems.
   IEEE Trans Neural Netw Learn Syst. 2015 Jan;26(1):140-51. doi: 10.1109/TNNLS.2014.2358227. Epub 2014 Oct 8.
6. Adaptive Optimal Surrounding Control of Multiple Unmanned Surface Vessels via Actor-Critic Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2024 Oct 18;PP. doi: 10.1109/TNNLS.2024.3474289.
7. Reinforcement Learning-Based Optimal Stabilization for Unknown Nonlinear Systems Subject to Inputs With Uncertain Constraints.
   IEEE Trans Neural Netw Learn Syst. 2020 Oct;31(10):4330-4340. doi: 10.1109/TNNLS.2019.2954983. Epub 2019 Dec 27.
8. A policy iteration approach to online optimal control of continuous-time constrained-input systems.
   ISA Trans. 2013 Sep;52(5):611-21. doi: 10.1016/j.isatra.2013.04.004. Epub 2013 May 24.
9. Adaptive Neural Network-Based Finite-Time Online Optimal Tracking Control of the Nonlinear System With Dead Zone.
   IEEE Trans Cybern. 2021 Jan;51(1):382-392. doi: 10.1109/TCYB.2019.2939424. Epub 2020 Dec 22.
10. Event-Triggered Distributed Control of Nonlinear Interconnected Systems Using Online Reinforcement Learning With Exploration.
    IEEE Trans Cybern. 2018 Sep;48(9):2510-2519. doi: 10.1109/TCYB.2017.2741342. Epub 2017 Sep 7.

Cited by

1. ADHDP-based robust self-learning 3D trajectory tracking control for underactuated UUVs.
   PeerJ Comput Sci. 2024 Dec 10;10:e2605. doi: 10.7717/peerj-cs.2605. eCollection 2024.
2. Real-time Trajectory Planning and Tracking Control of Bionic Underwater Robot in Dynamic Environment.
   Cyborg Bionic Syst. 2024 May 9;5:0112. doi: 10.34133/cbsystems.0112. eCollection 2024.
3. Event trigger based adaptive neural trajectory tracking finite time control for underactuated unmanned marine surface vessels with asymmetric input saturation.
   Sci Rep. 2023 Jun 22;13(1):10126. doi: 10.1038/s41598-023-37331-6.