Fast and Epsilon-Optimal Discretized Pursuit Learning Automata.

Publication information

IEEE Trans Cybern. 2015 Oct;45(10):2089-99. doi: 10.1109/TCYB.2014.2365463. Epub 2014 Nov 13.

DOI:10.1109/TCYB.2014.2365463

Abstract

Learning automata (LA) are powerful tools for reinforcement learning. A discretized pursuit LA is the most popular one among them. During an iteration its operation consists of three basic phases: 1) selecting the next action; 2) finding the optimal estimated action; and 3) updating the state probability. However, when the number of actions is large, the learning becomes extremely slow because there are too many updates to be made at each iteration. The increased updates are mostly from phases 1 and 3. A new fast discretized pursuit LA with assured ε-optimality is proposed to perform both phases 1 and 3 with the computational complexity independent of the number of actions. Apart from its low computational complexity, it achieves faster convergence speed than the classical one when operating in stationary environments. This paper can promote the applications of LA toward the large-scale-action oriented area that requires efficient reinforcement learning tools with assured ε-optimality, fast convergence speed, and low computational complexity for each iteration.
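
The three phases named in the abstract can be illustrated with a minimal sketch of the classical discretized pursuit scheme in its reward-inaction form. This is not the fast variant the paper proposes, whose details the abstract does not give. The function name `discretized_pursuit`, its parameters (`resolution`, `init_pulls`, `max_iters`, `seed`) and the simulated Bernoulli environment are assumptions made for illustration. Probabilities are stored as integer multiples of the smallest step so that the arithmetic stays exact. Each of the three phases loops over all `r` actions, which is the per-iteration cost the abstract describes as the bottleneck.

```python
import random


def discretized_pursuit(reward_probs, resolution=100, init_pulls=10,
                        max_iters=100_000, seed=0):
    """Classical discretized pursuit LA (reward-inaction form) on a simulated
    stationary environment. Returns (chosen action, iterations used)."""
    rng = random.Random(seed)
    r = len(reward_probs)
    total = r * resolution          # probability 1.0 expressed in steps of 1/(r*resolution)
    units = [resolution] * r        # action probabilities, in those steps
    rewards = [0] * r               # number of times each action was rewarded
    pulls = [0] * r                 # number of times each action was tried

    def pull(i):
        # Hypothetical environment: action i is rewarded with probability reward_probs[i].
        rewarded = rng.random() < reward_probs[i]
        pulls[i] += 1
        rewards[i] += rewarded
        return rewarded

    # Initialise the reward estimates by trying every action a few times.
    for i in range(r):
        for _ in range(init_pulls):
            pull(i)

    for t in range(1, max_iters + 1):
        # Phase 1: select the next action by sampling the probability vector.
        action = rng.choices(range(r), weights=units)[0]
        rewarded = pull(action)

        # Phase 2: find the action with the highest estimated reward probability.
        best = max(range(r), key=lambda i: rewards[i] / pulls[i])

        # Phase 3: on reward, move every other action one step towards zero
        # and give the remaining probability mass to the current best action.
        if rewarded:
            for i in range(r):
                if i != best:
                    units[i] = max(units[i] - 1, 0)
            units[best] = total - sum(units[i] for i in range(r) if i != best)

        if units[best] == total:    # all probability mass is on one action
            return best, t
    return max(range(r), key=lambda i: units[i]), max_iters


if __name__ == "__main__":
    action, iterations = discretized_pursuit([0.2, 0.9, 0.4, 0.5])
    print(f"selected action {action} after {iterations} iterations")
```

A larger `resolution` makes each step smaller, which in this sketch trades slower convergence for a lower chance of locking onto a suboptimal action.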

Similar articles

Last-position elimination-based learning automata.

IEEE Trans Cybern. 2014 Dec;44(12):2484-92. doi: 10.1109/TCYB.2014.2309478. Epub 2014 Apr 2.

The Hierarchical Discrete Pursuit Learning Automaton: A Novel Scheme With Fast Convergence and Epsilon-Optimality.

IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):8278-8292. doi: 10.1109/TNNLS.2022.3226538. Epub 2024 Jun 3.

Generalized pursuit learning schemes: new families of continuous and discretized learning automata.

IEEE Trans Syst Man Cybern B Cybern. 2002;32(6):738-49. doi: 10.1109/TSMCB.2002.1049608.

Continuous and discretized pursuit learning schemes: various algorithms and their comparison.

IEEE Trans Syst Man Cybern B Cybern. 2001;31(3):277-87. doi: 10.1109/3477.931507.

The Hierarchical Continuous Pursuit Learning Automation: A Novel Scheme for Environments With Large Numbers of Actions.

IEEE Trans Neural Netw Learn Syst. 2020 Feb;31(2):512-526. doi: 10.1109/TNNLS.2019.2905162. Epub 2019 Apr 11.

A new class of epsilon-optimal learning automata.

IEEE Trans Syst Man Cybern B Cybern. 2004 Feb;34(1):246-54. doi: 10.1109/tsmcb.2003.811117.

Varieties of learning automata: an overview.

IEEE Trans Syst Man Cybern B Cybern. 2002;32(6):711-22. doi: 10.1109/TSMCB.2002.1049606.

Finite time analysis of the pursuit algorithm for learning automata.

IEEE Trans Syst Man Cybern B Cybern. 1996;26(4):590-8. doi: 10.1109/3477.517033.

Learning Automata-Based Multiagent Reinforcement Learning for Optimization of Cooperative Tasks.

IEEE Trans Neural Netw Learn Syst. 2021 Oct;32(10):4639-4652. doi: 10.1109/TNNLS.2020.3025711. Epub 2021 Oct 5.
