Continuous and discretized pursuit learning schemes: various algorithms and their comparison.

Authors

Oommen B J, Agache M

Affiliation

Sch. of Comput. Sci., Carleton Univ., Ottawa, Ont.

Publication

IEEE Trans Syst Man Cybern B Cybern. 2001;31(3):277-87. doi: 10.1109/3477.931507.

DOI:10.1109/3477.931507
PMID:18244792
Abstract

A learning automaton (LA) is an automaton that interacts with a random environment, with the goal of learning the optimal action from its acquired experience. Many learning automata (LAs) have been proposed, and the class of estimator algorithms is among the fastest. Thathachar and Sastry, through the pursuit algorithm, introduced the concept of learning algorithms that pursue the current optimal action, following a reward-penalty learning philosophy. Later, Oommen and Lanctot extended the pursuit algorithm into the discretized world by presenting the discretized pursuit algorithm, based on a reward-inaction learning philosophy. In this paper we argue that the reward-penalty and reward-inaction learning paradigms, in conjunction with the continuous and discrete models of computation, lead to four versions of pursuit learning automata. We contend that a scheme that merges the pursuit concept with the most recent response of the environment permits the algorithm to utilize the LA's long-term and short-term perspectives of the environment. We present all four resultant pursuit algorithms, prove the ε-optimality of the newly introduced algorithms, and present a quantitative comparison between them.
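
The four variants named in the abstract (continuous or discretized, reward-penalty or reward-inaction) can be illustrated with a minimal sketch. The abstract gives no update equations, so everything below is an assumption based on the common textbook form of pursuit schemes: the function name `pursuit`, its parameters, the warm-up sampling used to initialise the reward estimates, the step size `1 / (r * resolution)` for the discretized case, and the order of the probability and estimate updates are all illustrative and may differ from the paper.

```python
import random


def pursuit(reward_probs, steps=2000, rate=0.05, resolution=100,
            discretized=False, update_on_penalty=False,
            init_samples=10, seed=0):
    """Hypothetical sketch of a pursuit learning automaton.

    reward_probs      -- Bernoulli reward probability of each action
                         (a stand-in for the random environment)
    discretized       -- False: continuous update, True: fixed-size steps
    update_on_penalty -- False: reward-inaction, True: reward-penalty
    """
    rng = random.Random(seed)
    r = len(reward_probs)

    # Assumed warm-up: sample every action a few times to seed the estimates.
    chosen = [init_samples] * r
    rewarded = [sum(rng.random() < d for _ in range(init_samples))
                for d in reward_probs]
    estimates = [w / z for w, z in zip(rewarded, chosen)]

    p = [1.0 / r] * r                 # action probability vector
    delta = 1.0 / (r * resolution)    # assumed step size, discretized case

    for _ in range(steps):
        a = rng.choices(range(r), weights=p)[0]
        reward = rng.random() < reward_probs[a]

        # Reward-inaction updates only on reward; reward-penalty always.
        if reward or update_on_penalty:
            # Pursue the action with the highest current reward estimate.
            m = max(range(r), key=estimates.__getitem__)
            if discretized:
                p = [max(pj - delta, 0.0) for pj in p]
                p[m] = 0.0
                p[m] = 1.0 - sum(p)   # remaining mass goes to action m
            else:
                p = [(1.0 - rate) * pj for pj in p]
                p[m] += rate

        # Update the running reward estimate of the chosen action.
        chosen[a] += 1
        rewarded[a] += reward
        estimates[a] = rewarded[a] / chosen[a]

    return p, estimates
```

For example, `pursuit([0.2, 0.8], discretized=True)` runs the discretized reward-inaction variant against a two-action environment and returns the final action probabilities together with the reward estimates. The two boolean flags select among the four combinations; the estimates play the long-term role and the latest reward or penalty the short-term role described in the abstract.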

Similar articles

1
Continuous and discretized pursuit learning schemes: various algorithms and their comparison.
IEEE Trans Syst Man Cybern B Cybern. 2001;31(3):277-87. doi: 10.1109/3477.931507.
2
Generalized pursuit learning schemes: new families of continuous and discretized learning automata.
IEEE Trans Syst Man Cybern B Cybern. 2002;32(6):738-49. doi: 10.1109/TSMCB.2002.1049608.
3
Last-position elimination-based learning automata.
IEEE Trans Cybern. 2014 Dec;44(12):2484-92. doi: 10.1109/TCYB.2014.2309478. Epub 2014 Apr 2.
4
Finite time analysis of the pursuit algorithm for learning automata.
IEEE Trans Syst Man Cybern B Cybern. 1996;26(4):590-8. doi: 10.1109/3477.517033.
5
Discretized learning automata solutions to the capacity assignment problem for prioritized networks.
IEEE Trans Syst Man Cybern B Cybern. 2002;32(6):821-31. doi: 10.1109/TSMCB.2002.1049616.
6
Modeling a student's behavior in a tutorial-like system using learning automata.
IEEE Trans Syst Man Cybern B Cybern. 2010 Apr;40(2):481-92. doi: 10.1109/TSMCB.2009.2027220. Epub 2009 Sep 9.
7
Fast and Epsilon-Optimal Discretized Pursuit Learning Automata.
IEEE Trans Cybern. 2015 Oct;45(10):2089-99. doi: 10.1109/TCYB.2014.2365463. Epub 2014 Nov 13.
8
Random early detection for congestion avoidance in wired networks: a discretized pursuit learning-automata-like solution.
IEEE Trans Syst Man Cybern B Cybern. 2010 Feb;40(1):66-76. doi: 10.1109/TSMCB.2009.2032363.
9
Varieties of learning automata: an overview.
IEEE Trans Syst Man Cybern B Cybern. 2002;32(6):711-22. doi: 10.1109/TSMCB.2002.1049606.
10
A generalized learning algorithm for an automaton operating in a multiteacher environment.
IEEE Trans Syst Man Cybern B Cybern. 1999;29(5):592-600. doi: 10.1109/3477.790442.

Cited by

1
A parameter-free learning automaton scheme.
Front Neurorobot. 2022 Sep 23;16:999658. doi: 10.3389/fnbot.2022.999658. eCollection 2022.