


Data-Based Optimal Synchronization of Heterogeneous Multiagent Systems in Graphical Games via Reinforcement Learning.

Authors

Xiong Chunping, Ma Qian, Guo Jian, Lewis Frank L

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15984-15992. doi: 10.1109/TNNLS.2023.3291542. Epub 2024 Oct 29.

DOI: 10.1109/TNNLS.2023.3291542
PMID: 37463077
Abstract

This article studies the optimal synchronization of linear heterogeneous multiagent systems (MASs) with partially unknown system dynamics. The objective is to achieve system synchronization while minimizing the performance index of each agent. A framework of heterogeneous multiagent graphical games is formulated first. In the graphical games, it is proved that the optimal control policy, which relies on the solution of the Hamilton-Jacobi-Bellman (HJB) equation, is not only a Nash equilibrium but also the best response to the fixed control policies of its neighbors. To compute the optimal control policy and the minimum value of the performance index, a model-based policy iteration (PI) algorithm is proposed. Then, building on the model-based algorithm, a data-based off-policy integral reinforcement learning (IRL) algorithm is put forward to handle the partially unknown system dynamics. Furthermore, a single-critic neural network (NN) structure is used to implement the data-based algorithm. Using the data collected by the behavior policy of the off-policy algorithm, the gradient descent method is used to train the NNs to approximate the ideal weights. In addition, it is proved that all the proposed algorithms are convergent, and that the weight-tuning law of the single-critic NNs can promote optimal synchronization. Finally, a numerical example is provided to show the effectiveness of the theoretical analysis.
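The paper's model-based PI algorithm operates per agent and is coupled through the communication graph; as a simplified single-agent illustration of the same policy evaluation/improvement cycle, the sketch below runs Kleinman's policy iteration for a linear-quadratic problem. The system matrices, function name, and iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

def kleinman_pi(A, B, Q, R, K0, iters=30):
    """Model-based policy iteration for a single LQR agent:
    alternate policy evaluation (a Lyapunov equation) and
    policy improvement (a greedy feedback-gain update)."""
    K = K0
    for _ in range(iters):
        Ak = A - B @ K
        # Policy evaluation: solve Ak' P + P Ak + Q + K' R K = 0 for P.
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P.
        K = np.linalg.solve(R, B.T @ P)
    return K, P

A = np.array([[0.0, 1.0], [-1.0, -2.0]])  # open-loop stable, so K0 = 0 is admissible
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K, P = kleinman_pi(A, B, Q, R, K0=np.zeros((1, 2)))

P_star = solve_continuous_are(A, B, Q, R)  # ground truth from the Riccati equation
```

Each evaluation step requires knowledge of A and B, which is exactly the limitation the paper's data-based off-policy IRL algorithm removes.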

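The single-critic weight tuning described in the abstract amounts to gradient descent on an integral Bellman residual. Below is a minimal sketch of one such update; the quadratic feature basis, learning rate, and function names are my assumptions for illustration, not the paper's actual tuning law.

```python
import numpy as np

# Hypothetical quadratic feature map for a single-critic value
# approximator V(x) = w . phi(x); the basis choice is an assumption.
def phi(x):
    x1, x2 = x
    return np.array([x1 * x1, x1 * x2, x2 * x2])

def critic_update(w, x, x_next, cost_integral, lr=0.05):
    """One gradient-descent step on the squared integral Bellman
    residual  e = w . (phi(x) - phi(x_next)) - cost_integral,
    which should be zero under the ideal critic weights."""
    dphi = phi(x) - phi(x_next)
    e = w @ dphi - cost_integral
    return w - lr * e * dphi
```

In the paper's off-policy setting, `cost_integral` would be the integrated state/control cost measured along a trajectory generated by the behavior policy, so no model of the dynamics is needed.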

Similar Articles

1
Data-Based Optimal Synchronization of Heterogeneous Multiagent Systems in Graphical Games via Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15984-15992. doi: 10.1109/TNNLS.2023.3291542. Epub 2024 Oct 29.
2
Model-Free Reinforcement Learning for Fully Cooperative Consensus Problem of Nonlinear Multiagent Systems.
IEEE Trans Neural Netw Learn Syst. 2022 Apr;33(4):1482-1491. doi: 10.1109/TNNLS.2020.3042508. Epub 2022 Apr 4.
3
Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games.
IEEE Trans Neural Netw Learn Syst. 2017 Oct;28(10):2434-2445. doi: 10.1109/TNNLS.2016.2609500. Epub 2017 Apr 17.
4
Cooperative Differential Game-Based Distributed Optimal Synchronization Control of Heterogeneous Nonlinear Multiagent Systems.
IEEE Trans Cybern. 2023 Dec;53(12):7933-7942. doi: 10.1109/TCYB.2023.3240983. Epub 2023 Nov 29.
5
Optimal Synchronization Control of Multiagent Systems With Input Saturation via Off-Policy Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2019 Jan;30(1):85-96. doi: 10.1109/TNNLS.2018.2832025. Epub 2018 May 24.
6
Off-Policy Integral Reinforcement Learning Method to Solve Nonlinear Continuous-Time Multiplayer Nonzero-Sum Games.
IEEE Trans Neural Netw Learn Syst. 2017 Mar;28(3):704-713. doi: 10.1109/TNNLS.2016.2582849. Epub 2016 Jul 20.
7
Hierarchical Optimal Synchronization for Linear Systems via Reinforcement Learning: A Stackelberg-Nash Game Perspective.
IEEE Trans Neural Netw Learn Syst. 2021 Apr;32(4):1600-1611. doi: 10.1109/TNNLS.2020.2985738. Epub 2021 Apr 2.
8
Integral Reinforcement-Learning-Based Optimal Containment Control for Partially Unknown Nonlinear Multiagent Systems.
Entropy (Basel). 2023 Jan 23;25(2):221. doi: 10.3390/e25020221.
9
A Novel Mean-Field-Game-Type Optimal Control for Very Large-Scale Multiagent Systems.
IEEE Trans Cybern. 2022 Jun;52(6):5197-5208. doi: 10.1109/TCYB.2020.3028267. Epub 2022 Jun 16.
10
Finite-Horizon Optimal Consensus Control for Unknown Multiagent State-Delay Systems.
IEEE Trans Cybern. 2020 Feb;50(2):402-413. doi: 10.1109/TCYB.2018.2856510. Epub 2018 Sep 10.