Finite-Horizon Optimal Consensus Control for Unknown Multiagent State-Delay Systems.

Publication information

IEEE Trans Cybern. 2020 Feb;50(2):402-413. doi: 10.1109/TCYB.2018.2856510. Epub 2018 Sep 10.

Abstract

This paper investigates the finite-horizon optimal consensus control problem for unknown multiagent systems with state delays. It is well known that the optimal consensus control is obtained from the solutions to the coupled Hamilton-Jacobi-Bellman (HJB) equations. An off-policy reinforcement learning (RL) algorithm is developed to learn the two-stage optimal consensus solutions to the coupled time-varying HJB equations using measurable state data instead of knowledge of the state-delayed system dynamics. Subsequently, for each agent, a single critic neural network (NN) is used to approximate the time-varying cost function and to help calculate the optimal consensus control policy. Based on the method of weighted residuals, adaptive weight update laws for the critic NNs are proposed. Finally, simulation results are provided to illustrate the effectiveness of the proposed off-policy RL method.
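
For orientation, a finite-horizon HJB equation in its simplest single-agent, delay-free form can be sketched as follows. This is a generic textbook form under assumed notation, not the coupled, state-delayed multiagent equations derived in the paper; the symbols f, g, Q, R, \phi, t_f, the basis vector \sigma, and the weight estimate \hat{W} are all introduced here purely for illustration.

```latex
% Illustrative single-agent, delay-free setting (assumed, not from the paper).
\begin{align}
\dot{x} &= f(x) + g(x)\,u, \\
V(x(t),t) &= \phi\big(x(t_f)\big) + \int_t^{t_f} \big( Q(x) + u^\top R\,u \big)\, d\tau, \\
-\frac{\partial V^*}{\partial t} &= \min_u \Big[ Q(x) + u^\top R\,u
  + (\nabla_x V^*)^\top \big( f(x) + g(x)\,u \big) \Big],
  \qquad V^*(x,t_f) = \phi(x), \\
u^* &= -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla_x V^*, \\
\hat{V}(x,t) &= \hat{W}^\top \sigma(x,t).
\end{align}
```

In this sketch the terminal condition makes the cost function explicitly time-dependent, which is why a critic for a finite-horizon problem typically uses time-dependent basis functions. A weighted-residual scheme then selects \hat{W} so that the residual left after substituting \hat{V} into the HJB equation is orthogonal, in an averaged sense, to a chosen set of weighting functions.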
