Data-Driven H∞ Output Consensus for Heterogeneous Multiagent Systems Under Switching Topology via Reinforcement Learning.

Authors

Liu Qiwei, Yan Huaicheng, Zhang Hao, Wang Meng, Tian Yongxiao

Publication information

IEEE Trans Cybern. 2024 Dec;54(12):7865-7876. doi: 10.1109/TCYB.2024.3419056. Epub 2024 Nov 27.

DOI: 10.1109/TCYB.2024.3419056
PMID: 39120994
Abstract

In this article, a novel model-free policy gradient reinforcement learning algorithm is proposed to solve the tracking problem for discrete-time heterogeneous multiagent systems with external disturbances over a switching topology. The dynamics of the followers and the leader are unknown, and, because of the switching topology, the leader's information is not available to every agent. Therefore, a distributed adaptive observer is introduced so that each agent can learn the leader's dynamic model and estimate the leader's state. For the tracking problem, an exponentially discounted value function is established and the related discrete-time game algebraic Riccati equation (DTGARE) is derived, which is the key to obtaining the control strategy. Furthermore, a data-based policy gradient algorithm is proposed to approximate the solution of the DTGARE online, so that accurate knowledge of the agents' dynamics is not required. To improve the efficiency of data utilization, an offline dataset and an experience replay scheme are used. In addition, a lower bound on the exponential discount value is explored to ensure the stability of the systems. Finally, a simulation is provided to show the validity of the proposed method.
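The abstract names the DTGARE but does not write it out. As a rough illustration only, the sketch below runs model-based value iteration on a generic discounted zero-sum linear-quadratic game, which is the kind of equation a DTGARE is. It is not the article's data-driven policy gradient algorithm, which is designed to avoid using the system matrices at all. Every name and number here (`A`, `B`, `D`, `Q`, `R`, the attenuation level `beta`, the discount factor `gamma`, and the function names) is an assumption made for the example and does not come from the article.

```python
import numpy as np

# Assumed setup (not from the article):
#   x[k+1] = A x[k] + B u[k] + D w[k]
#   cost   = sum_k gamma^k (x'Qx + u'Ru - beta^2 w'w)
# The controller u minimizes the cost, the disturbance w maximizes it.


def _game_blocks(P, A, B, D, R, beta, gamma):
    """Build the stacked matrices that appear in the game Riccati map."""
    q = D.shape[1]
    N = np.vstack([B.T @ P @ A, D.T @ P @ A])
    M = np.block([
        [R + gamma * B.T @ P @ B, gamma * B.T @ P @ D],
        [gamma * D.T @ P @ B, gamma * D.T @ P @ D - beta**2 * np.eye(q)],
    ])
    return M, N


def riccati_map(P, A, B, D, Q, R, beta, gamma):
    """One application of the discounted game Riccati operator to P."""
    M, N = _game_blocks(P, A, B, D, R, beta, gamma)
    return Q + gamma * A.T @ P @ A - gamma**2 * N.T @ np.linalg.solve(M, N)


def solve_game_riccati(A, B, D, Q, R, beta, gamma, iters=500):
    """Value iteration from P = 0; a fixed point satisfies P = riccati_map(P)."""
    P = np.zeros_like(Q, dtype=float)
    for _ in range(iters):
        P = riccati_map(P, A, B, D, Q, R, beta, gamma)
    return P


def game_gains(P, A, B, D, R, beta, gamma):
    """Return (K, L) with saddle-point policies u = -K x and w = -L x."""
    M, N = _game_blocks(P, A, B, D, R, beta, gamma)
    KL = gamma * np.linalg.solve(M, N)
    m = B.shape[1]
    return KL[:m, :], KL[m:, :]


# Made-up two-state example with one control input and one disturbance input.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.0]])
Q = np.eye(2)
R = np.array([[1.0]])
beta, gamma = 5.0, 0.9

P = solve_game_riccati(A, B, D, Q, R, beta, gamma)
K, L = game_gains(P, A, B, D, R, beta, gamma)
print("P =\n", P)
print("K =", K, "L =", L)
```

A data-driven method of the kind the abstract describes would aim at the same fixed point while estimating the needed quantities from measured trajectories instead of from `A`, `B`, and `D`.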

