

Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate.

Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China.

Publication information

Neural Netw. 2024 Jul;175:106274. doi: 10.1016/j.neunet.2024.106274. Epub 2024 Mar 27.

Abstract

This paper develops an adjustable Q-learning scheme for the discrete-time nonlinear zero-sum game problem that accelerates the convergence of the iterative Q-function sequence. First, the monotonicity and convergence of the iterative Q-function sequence are analyzed under certain conditions. By employing neural networks, the scheme also handles the model-free tracking control problem for zero-sum games. Second, two practical algorithms are designed that guarantee convergence while accelerating learning. In the first algorithm, an adjustable acceleration phase is added to the Q-learning iteration and is terminated adaptively so that convergence remains guaranteed. In the second algorithm, a novel acceleration function adjusts the relaxation factor to ensure convergence. Finally, a simulation example with a practical physical background demonstrates the performance of the developed algorithm, implemented with neural networks.
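The idea of a relaxation factor in an iterative Q-function update can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a tabular Q-function in place of neural networks, a small hypothetical finite game (`step`, `cost`, `N_X`, `N_U`, `N_W` are invented for illustration), a fixed relaxation factor `omega` rather than an adaptive acceleration function, and a discount factor `gamma` added only so that this toy iteration provably converges. The update used is Q_{i+1} = Q_i + omega * (T Q_i - Q_i), where T is the min-max Bellman operator; `omega = 1` gives the standard iteration.

```python
import itertools

# Hypothetical toy game (an assumption, not from the paper):
# 5 states, 3 control actions, 3 disturbance actions.
N_X, N_U, N_W = 5, 3, 3


def step(x, u, w):
    """Deterministic toy dynamics; actions 0, 1, 2 act as -1, 0, +1."""
    return min(max(x + (u - 1) + (w - 1), 0), N_X - 1)


def cost(x, u, w):
    """Toy zero-sum utility: the controller minimizes it, the disturbance maximizes it."""
    return float(x * x + (u - 1) ** 2 - 2 * (w - 1) ** 2)


def relaxed_q_iteration(omega=1.0, gamma=0.9, tol=1e-8, max_iter=20000):
    """Iterate Q <- Q + omega * (T Q - Q) until successive iterates differ by < tol.

    For this discounted toy, the relaxed operator is a sup-norm contraction
    whenever 0 < omega < 2 / (1 + gamma).
    Returns the final Q table and the number of iterations used.
    """
    keys = list(itertools.product(range(N_X), range(N_U), range(N_W)))
    q = {k: 0.0 for k in keys}
    for i in range(1, max_iter + 1):
        new_q = {}
        for (x, u, w) in keys:
            x_next = step(x, u, w)
            # Min over the control action, max over the disturbance action.
            value = min(
                max(q[(x_next, u2, w2)] for w2 in range(N_W))
                for u2 in range(N_U)
            )
            target = cost(x, u, w) + gamma * value
            new_q[(x, u, w)] = q[(x, u, w)] + omega * (target - q[(x, u, w)])
        diff = max(abs(new_q[k] - q[k]) for k in keys)
        q = new_q
        if diff < tol:
            return q, i
    return q, max_iter


if __name__ == "__main__":
    q_std, n_std = relaxed_q_iteration(omega=1.0)
    q_rel, n_rel = relaxed_q_iteration(omega=1.05)
    print("iterations (omega=1.0):", n_std)
    print("iterations (omega=1.05):", n_rel)
```

In this sketch both settings of `omega` approach the same fixed point of T. Whether a given relaxation factor actually speeds up convergence depends on the problem, which is presumably why the paper adjusts the factor or terminates the acceleration phase adaptively instead of fixing it in advance.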

