Adaptive Optimal Surrounding Control of Multiple Unmanned Surface Vessels via Actor-Critic Reinforcement Learning.

Authors

Lu Renzhi, Wang Xiaotao, Ding Yiyu, Zhang Hai-Tao, Zhao Feng, Zhu Lijun, He Yong

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Oct 18;PP. doi: 10.1109/TNNLS.2024.3474289.

DOI:10.1109/TNNLS.2024.3474289
PMID:39423077
Abstract

In this article, an optimal surrounding control algorithm is proposed for multiple unmanned surface vessels (USVs), in which actor-critic reinforcement learning (RL) is utilized to optimize the merging process. Specifically, the multiple-USV optimal surrounding control problem is first transformed into the Hamilton-Jacobi-Bellman (HJB) equation, which is difficult to solve due to its nonlinearity. An adaptive actor-critic RL control paradigm is then proposed to obtain the optimal surrounding strategy, wherein the Bellman residual error is utilized to construct the network update laws. In particular, a virtual controller representing intermediate transitions and an actual controller operating on a dynamics model are employed as surrounding control solutions for second-order USVs; thus, optimal surrounding control of the USVs is guaranteed. In addition, the stability of the proposed controller is analyzed by means of Lyapunov theory. Finally, numerical simulation results demonstrate that the proposed actor-critic RL-based surrounding controller achieves the surrounding objective while optimizing the evolution process, and reduces trajectory length and energy consumption by 9.76% and 20.85%, respectively, compared with the existing controller.
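
For orientation, the HJB equation and the Bellman-residual update mentioned in the abstract can be sketched in a generic textbook form. This is not taken from the article: the control-affine dynamics $\dot{x} = f(x) + g(x)u$, the quadratic cost weights $Q$ and $R$, the basis vector $\phi(x)$, the weight estimates $\hat{W}_c$ and $\hat{W}_a$, and the learning rate $\alpha$ are all assumptions introduced here for illustration, and the article's actual formulation for second-order USVs may differ.

```latex
% Generic infinite-horizon optimal control problem (assumed form)
V^{*}(x) = \min_{u} \int_{t}^{\infty}
  \left( x^{\top} Q x + u^{\top} R u \right) \mathrm{d}\tau ,
\qquad \dot{x} = f(x) + g(x)\,u .

% HJB equation and the resulting optimal control
0 = \min_{u} \left[ x^{\top} Q x + u^{\top} R u
  + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \right],
\qquad
u^{*}(x) = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x) .

% Critic and actor approximations (\nabla\phi is the Jacobian of \phi)
\hat{V}(x) = \hat{W}_c^{\top} \phi(x),
\qquad
\hat{u}(x) = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla\phi(x)^{\top} \hat{W}_a .

% Bellman residual and a plain gradient-descent critic update on \delta^2 / 2
\delta = x^{\top} Q x + \hat{u}^{\top} R \hat{u}
  + \hat{W}_c^{\top} \sigma ,
\qquad
\sigma = \nabla\phi(x) \bigl( f(x) + g(x)\,\hat{u} \bigr),
\qquad
\dot{\hat{W}}_c = -\alpha\, \delta\, \sigma .
```

The residual $\delta$ vanishes when $\hat{V}$ satisfies the HJB equation along the applied control, which is why driving it toward zero is a common way to tune the critic weights without solving the nonlinear equation directly.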

