Differential-game for resource aware approximate optimal control of large-scale nonlinear systems with multiple players.

Affiliations

555 Engineering North, Division of Engineering Technology, Oklahoma State University, Stillwater, OK 74078, United States of America.

Washington University, St. Louis, MO, United States of America.

Publication information

Neural Netw. 2020 Apr;124:95-108. doi: 10.1016/j.neunet.2019.12.031. Epub 2020 Jan 14.

DOI:10.1016/j.neunet.2019.12.031

PMID:31986447

Abstract

In this paper, we propose a novel differential-game based neural network (NN) control architecture to solve an optimal control problem for a class of large-scale nonlinear systems involving N-players. We focus on optimizing the usage of the computational resources along with the system performance simultaneously. In particular, the N-players' control policies are desired to be designed such that they cooperatively optimize the large-scale system performance, and the sampling intervals for each player are desired to reduce the frequency of feedback execution. To develop a unified design framework that achieves both these objectives, we propose an optimal control problem by integrating both the design requirements, which leads to a multi-player differential-game. A solution to this problem is numerically obtained by solving the associated Hamilton-Jacobi (HJ) equation using event-driven approximate dynamic programming (E-ADP) and artificial NNs online and forward-in-time. We employ the critic neural networks to approximate the solution to the HJ equation, i.e., the optimal value function, with aperiodically available feedback information. Using the NN approximated value function, we design the control policies and the sampling schemes. Finally, the event-driven N-player system is remodeled as a hybrid dynamical system with impulsive weight update rules for analyzing its stability and convergence properties. The closed-loop practical stability of the system and Zeno free behavior of the sampling scheme are demonstrated using the Lyapunov method. Simulation results using a numerical example are also included to substantiate the analytical results.
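The event-driven feedback idea in the abstract can be illustrated with a much simpler toy than the paper's design. The sketch below is not the E-ADP scheme and uses no critic network: it assumes a scalar linear plant, a fixed feedback gain, and a relative-error trigger, all of which are hypothetical choices made only for illustration (the names `simulate`, `threshold`, `a`, and `k` do not come from the paper). It shows how holding the last sampled state until a trigger fires can cut the number of feedback updates while the state still converges.

```python
def simulate(threshold, dt=0.001, t_end=5.0, x0=1.0, a=1.0, k=3.0):
    """Euler simulation of dx/dt = a*x + u with u = -k * x_held.

    Assumed toy setup (not the paper's system): x_held is the last
    sampled state, refreshed only when |x - x_held| > threshold * |x|.
    With threshold = 0 the state is resampled at essentially every
    step, which mimics periodic feedback.
    Returns (final_state, number_of_feedback_updates).
    """
    x = x0
    x_held = x0          # the initial sample counts as the first update
    updates = 1
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Event trigger: resample only when the held state is stale.
        if abs(x - x_held) > threshold * abs(x):
            x_held = x
            updates += 1
        u = -k * x_held              # control uses the held sample
        x += dt * (a * x + u)        # forward-Euler plant step
    return x, updates


if __name__ == "__main__":
    x_periodic, n_periodic = simulate(threshold=0.0)
    x_event, n_event = simulate(threshold=0.2)
    print("periodic feedback:", n_periodic, "updates, final state", x_periodic)
    print("event-driven:     ", n_event, "updates, final state", x_event)
```

In this toy the gap between consecutive triggers stays bounded away from zero, which is the kind of Zeno-free behavior the abstract refers to. The paper establishes that property for its own sampling scheme through a Lyapunov analysis, not through a fixed threshold like the one assumed here.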

Similar articles

Approximate Optimal Distributed Control of Nonlinear Interconnected Systems Using Event-Triggered Nonzero-Sum Games.

IEEE Trans Neural Netw Learn Syst. 2019 May;30(5):1512-1522. doi: 10.1109/TNNLS.2018.2869896. Epub 2018 Oct 8.

Optimization of sampling intervals for tracking control of nonlinear systems: A game theoretic approach.

Neural Netw. 2019 Jun;114:78-90. doi: 10.1016/j.neunet.2019.02.008. Epub 2019 Mar 8.

Event-driven H∞ control with critic learning for nonlinear systems.

Neural Netw. 2020 Dec;132:30-42. doi: 10.1016/j.neunet.2020.08.004. Epub 2020 Aug 20.

Observer-based event-triggered control for zero-sum games of input constrained multi-player nonlinear systems.

Neural Netw. 2021 Dec;144:101-112. doi: 10.1016/j.neunet.2021.08.012. Epub 2021 Aug 25.

Event-triggered distributed zero-sum differential game for nonlinear multi-agent systems using adaptive dynamic programming.

ISA Trans. 2021 Apr;110:39-52. doi: 10.1016/j.isatra.2020.10.043. Epub 2020 Oct 15.

Experience Replay for Optimal Control of Nonzero-Sum Game Systems With Unknown Dynamics.

IEEE Trans Cybern. 2016 Mar;46(3):854-65. doi: 10.1109/TCYB.2015.2488680. Epub 2015 Oct 26.

Event-Driven Off-Policy Reinforcement Learning for Control of Interconnected Systems.

IEEE Trans Cybern. 2022 Mar;52(3):1936-1946. doi: 10.1109/TCYB.2020.2991166. Epub 2022 Mar 11.

Approximate N-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System.

IEEE Trans Neural Netw Learn Syst. 2015 Aug;26(8):1645-58. doi: 10.1109/TNNLS.2014.2350835. Epub 2014 Oct 8.

Event-triggered integral reinforcement learning for nonzero-sum games with asymmetric input saturation.

Neural Netw. 2022 Aug;152:212-223. doi: 10.1016/j.neunet.2022.04.013. Epub 2022 Apr 21.
