Data-Driven Performance-Prescribed Reinforcement Learning Control of an Unmanned Surface Vehicle.

Authors

Wang Ning, Gao Ying, Zhang Xuefeng

Publication information

IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5456-5467. doi: 10.1109/TNNLS.2021.3056444. Epub 2021 Nov 30.

DOI:10.1109/TNNLS.2021.3056444
PMID:33606641
Abstract

An unmanned surface vehicle (USV) under complicated marine environments can hardly be modeled well such that model-based optimal control approaches become infeasible. In this article, a self-learning-based model-free solution only using input-output signals of the USV is innovatively provided. To this end, a data-driven performance-prescribed reinforcement learning control (DPRLC) scheme is created to pursue control optimality and prescribed tracking accuracy simultaneously. By devising state transformation with prescribed performance, constrained tracking errors are substantially converted into constraint-free stabilization of tracking errors with unknown dynamics. Reinforcement learning paradigm using neural network-based actor-critic learning framework is further deployed to directly optimize controller synthesis deduced from the Bellman error formulation such that transformed tracking errors evolve a data-driven optimal controller. Theoretical analysis eventually ensures that the entire DPRLC scheme can guarantee prescribed tracking accuracy, subject to optimal cost. Both simulations and virtual-reality experiments demonstrate the remarkable effectiveness and superiority of the proposed DPRLC scheme.
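
The abstract names a state transformation with prescribed performance but does not spell it out, so the following is only a minimal sketch of one common prescribed-performance construction, not the authors' formulation. It assumes an exponentially decaying envelope rho(t) and an inverse-hyperbolic-tangent error map; the function names and the parameters `rho0`, `rho_inf`, and `kappa` are hypothetical and chosen for illustration.

```python
import math


def performance_bound(t, rho0=2.0, rho_inf=0.1, kappa=1.0):
    """Assumed envelope: decays from rho0 at t=0 toward rho_inf at rate kappa."""
    return (rho0 - rho_inf) * math.exp(-kappa * t) + rho_inf


def transform_error(e, rho):
    """Map a constrained error |e| < rho to an unconstrained variable z.

    z = atanh(e / rho) tends to +/- infinity as |e| approaches rho.
    """
    ratio = e / rho
    if abs(ratio) >= 1.0:
        raise ValueError("tracking error is outside the prescribed envelope")
    return math.atanh(ratio)


def inverse_transform(z, rho):
    """Recover the original error; the result always satisfies |e| < rho."""
    return rho * math.tanh(z)


if __name__ == "__main__":
    # Illustrative values only: a fixed error checked against a shrinking envelope.
    for t in (0.0, 1.0, 2.0):
        rho = performance_bound(t)
        z = transform_error(0.2, rho)
        print(f"t={t:.1f}  rho={rho:.3f}  z={z:.3f}")
```

Because z grows without bound as |e| approaches rho, any controller that keeps z bounded also keeps the original error inside the envelope. That is the sense in which a constrained tracking problem can be recast as an unconstrained stabilization problem.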

Similar articles

1
Data-Driven Performance-Prescribed Reinforcement Learning Control of an Unmanned Surface Vehicle.
IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5456-5467. doi: 10.1109/TNNLS.2021.3056444. Epub 2021 Nov 30.
2
Reinforcement Learning-Based Optimal Tracking Control of an Unknown Unmanned Surface Vehicle.
IEEE Trans Neural Netw Learn Syst. 2021 Jul;32(7):3034-3045. doi: 10.1109/TNNLS.2020.3009214. Epub 2021 Jul 6.
3
Adaptive Optimal Surrounding Control of Multiple Unmanned Surface Vessels via Actor-Critic Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2024 Oct 18;PP. doi: 10.1109/TNNLS.2024.3474289.
4
Adaptive optimal trajectory tracking control of AUVs based on reinforcement learning.
ISA Trans. 2023 Jun;137:122-132. doi: 10.1016/j.isatra.2022.12.003. Epub 2022 Dec 8.
5
Prescribed Performance Fault-Tolerant Control for Uncertain Nonlinear MIMO System Using Actor-Critic Learning Structure.
IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4479-4490. doi: 10.1109/TNNLS.2021.3057482. Epub 2022 Aug 31.
6
Path Following Control for Unmanned Surface Vehicles: A Reinforcement Learning-Based Method With Experimental Validation.
IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):18237-18250. doi: 10.1109/TNNLS.2023.3313312. Epub 2024 Dec 2.
7
Actor-critic-based optimal tracking for partially unknown nonlinear discrete-time systems.
IEEE Trans Neural Netw Learn Syst. 2015 Jan;26(1):140-51. doi: 10.1109/TNNLS.2014.2358227. Epub 2014 Oct 8.
8
Distributed prescribed performance containment control for unmanned surface vehicles based on disturbance observer.
ISA Trans. 2022 Jun;125:699-706. doi: 10.1016/j.isatra.2021.12.007. Epub 2021 Dec 16.
9
Quantization-Based Adaptive Actor-Critic Tracking Control With Tracking Error Constraints.
IEEE Trans Neural Netw Learn Syst. 2018 Apr;29(4):970-980. doi: 10.1109/TNNLS.2017.2651104. Epub 2017 Feb 1.
10
Target Tracking Control of a Biomimetic Underwater Vehicle Through Deep Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):3741-3752. doi: 10.1109/TNNLS.2021.3054402. Epub 2022 Aug 3.

Cited by

1
ADHDP-based robust self-learning 3D trajectory tracking control for underactuated UUVs.
PeerJ Comput Sci. 2024 Dec 10;10:e2605. doi: 10.7717/peerj-cs.2605. eCollection 2024.
2
Real-time Trajectory Planning and Tracking Control of Bionic Underwater Robot in Dynamic Environment.
Cyborg Bionic Syst. 2024 May 9;5:0112. doi: 10.34133/cbsystems.0112. eCollection 2024.
3
Development of a Sliding-Mode-Control-Based Path-Tracking Algorithm with Model-Free Adaptive Feedback Action for Autonomous Vehicles.
Sensors (Basel). 2022 Dec 30;23(1):405. doi: 10.3390/s23010405.