Event-Triggered ADP for Nonzero-Sum Games of Unknown Nonlinear Systems.

Authors

Zhao Qingtao, Sun Jian, Wang Gang, Chen Jie

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):1905-1913. doi: 10.1109/TNNLS.2021.3071545. Epub 2022 May 2.

DOI:10.1109/TNNLS.2021.3071545
PMID:33882002
Abstract

For nonzero-sum (NZS) games of nonlinear systems, reinforcement learning (RL) or adaptive dynamic programming (ADP) has shown its capability of approximating the desired index performance and the optimal input policy iteratively. In this article, an event-triggered ADP is proposed for NZS games of continuous-time nonlinear systems with completely unknown system dynamics. To achieve the Nash equilibrium solution approximately, the critic neural networks and actor neural networks are utilized to estimate the value functions and the control policies, respectively. Compared with the traditional time-triggered mechanism, the proposed algorithm updates the neural network weights as well as the inputs of players only when a state-based event-triggered condition is violated. It is shown that the system stability and the weights' convergence are still guaranteed under mild assumptions, while occupation of communication and computation resources is considerably reduced. Meanwhile, the infamous Zeno behavior is excluded by proving the existence of a minimum inter-event time (MIET) to ensure the feasibility of the closed-loop event-triggered continuous-time system. Finally, a numerical example is simulated to illustrate the effectiveness of the proposed approach.
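
The state-based triggering idea in the abstract can be illustrated with a minimal sketch. This is not the article's algorithm: it has no critic or actor networks and no game between players. It assumes a hypothetical scalar linear plant dx/dt = a*x + b*u, a fixed feedback gain k, and a relative threshold sigma, all chosen purely for illustration. The control input is held constant between events and refreshed only when the gap between the current state and the last sampled state exceeds sigma times the current state magnitude.

```python
import math


def simulate(a=1.0, b=1.0, k=3.0, sigma=0.3, x0=1.0, dt=1e-3, t_end=5.0):
    """Euler simulation of dx/dt = a*x + b*u with u = -k*x_hat.

    x_hat is the state sampled at the most recent event; u is held
    constant between events. All parameters are illustrative assumptions.
    """
    steps = int(round(t_end / dt))
    x = x0
    x_hat = x0            # state sampled at the last event
    event_times = [0.0]   # the initial sample counts as the first event
    for i in range(1, steps + 1):
        u = -k * x_hat                 # input uses the held sample only
        x += dt * (a * x + b * u)      # forward-Euler step of the plant
        # State-based trigger: resample when the gap is too large.
        if abs(x - x_hat) > sigma * abs(x):
            x_hat = x
            event_times.append(i * dt)
    gaps = [t2 - t1 for t1, t2 in zip(event_times, event_times[1:])]
    min_gap = min(gaps) if gaps else math.inf
    return x, event_times, steps, min_gap


if __name__ == "__main__":
    x_final, events, steps, min_gap = simulate()
    print(f"final state: {x_final:.3e}")
    print(f"control updates: {len(events)} of {steps} time steps")
    print(f"smallest inter-event time: {min_gap:.4f} s (dt = 1e-3 s)")
```

With these assumed values the state still decays toward zero while the input is refreshed on only a small fraction of the time steps, and the smallest gap between events stays well above the integration step. That is a toy analogue of the resource saving and the minimum inter-event time (MIET) property described above, not a reproduction of the article's results.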

Similar articles

1. Event-Triggered ADP for Nonzero-Sum Games of Unknown Nonlinear Systems.
   IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):1905-1913. doi: 10.1109/TNNLS.2021.3071545. Epub 2022 May 2.
2. Decentralized Event-Triggered Adaptive Control of Discrete-Time Nonzero-Sum Games Over Wireless Sensor-Actuator Networks With Input Constraints.
   IEEE Trans Neural Netw Learn Syst. 2020 Oct;31(10):4254-4266. doi: 10.1109/TNNLS.2019.2953613. Epub 2020 Jan 13.
3. Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics.
   IEEE Trans Cybern. 2019 Aug;49(8):2874-2885. doi: 10.1109/TCYB.2018.2830820. Epub 2018 May 16.
4. Approximate Optimal Distributed Control of Nonlinear Interconnected Systems Using Event-Triggered Nonzero-Sum Games.
   IEEE Trans Neural Netw Learn Syst. 2019 May;30(5):1512-1522. doi: 10.1109/TNNLS.2018.2869896. Epub 2018 Oct 8.
5. Near-Optimal Control for Nonzero-Sum Differential Games of Continuous-Time Nonlinear Systems Using Single-Network ADP.
   IEEE Trans Cybern. 2013 Feb;43(1):206-16. doi: 10.1109/TSMCB.2012.2203336. Epub 2012 Jun 28.
6. Off-Policy Integral Reinforcement Learning Method to Solve Nonlinear Continuous-Time Multiplayer Nonzero-Sum Games.
   IEEE Trans Neural Netw Learn Syst. 2017 Mar;28(3):704-713. doi: 10.1109/TNNLS.2016.2582849. Epub 2016 Jul 20.
7. Experience Replay for Optimal Control of Nonzero-Sum Game Systems With Unknown Dynamics.
   IEEE Trans Cybern. 2016 Mar;46(3):854-65. doi: 10.1109/TCYB.2015.2488680. Epub 2015 Oct 26.
8. Asynchronous learning for actor-critic neural networks and synchronous triggering for multiplayer system.
   ISA Trans. 2022 Oct;129(Pt B):295-308. doi: 10.1016/j.isatra.2022.02.007. Epub 2022 Feb 10.
9. Observer-based event-triggered control for zero-sum games of input constrained multi-player nonlinear systems.
   Neural Netw. 2021 Dec;144:101-112. doi: 10.1016/j.neunet.2021.08.012. Epub 2021 Aug 25.
10. Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms.
    IEEE Trans Cybern. 2017 Oct;47(10):3331-3340. doi: 10.1109/TCYB.2016.2611613. Epub 2016 Oct 3.