Rminimax: An Optimally Randomized MINIMAX Algorithm.

Publication information

IEEE Trans Cybern. 2013 Feb;43(1):385-93. doi: 10.1109/TSMCB.2012.2207951. Epub 2012 Aug 6.

DOI: 10.1109/TSMCB.2012.2207951
PMID: 22893439
Abstract

This paper proposes a simple extension of the celebrated MINIMAX algorithm used in zero-sum two-player games, called Rminimax. The Rminimax algorithm allows controlling the strength of an artificial rival by randomizing its strategy in an optimal way. In particular, the randomized shortest-path framework is applied for biasing the artificial intelligence (AI) adversary toward worse or better solutions, therefore controlling its strength. In other words, our model aims at introducing/implementing bounded rationality to the MINIMAX algorithm. This framework takes into account all possible strategies by computing an optimal tradeoff between exploration (quantified by the entropy spread in the tree) and exploitation (quantified by the expected cost to an end game) of the game tree. As opposed to other tree-exploration techniques, this new algorithm considers complete paths of a tree (strategies) where a given entropy is spread. The optimal randomized strategy is efficiently computed by means of a simple recurrence relation while keeping the same complexity as the original MINIMAX. As a result, the Rminimax implements a nondeterministic strength-adapted AI opponent for board games in a principled way, thus avoiding the assumption of complete rationality. Simulations on two common games show that Rminimax behaves as expected.
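
The abstract's central idea, a recurrence that trades exploitation (expected cost) against exploration (entropy), can be illustrated with a small sketch. This is not the paper's exact recurrence, which the abstract does not state. It is a generic soft-min (free-energy) backup with a uniform reference policy, and it rests on several assumptions: the tree is given as nested Python lists with numeric leaves, leaf values are costs for the AI, the opponent is fully rational, and the names `soft_minimax`, `ai_policy`, and `theta` are hypothetical. Large `theta` approaches plain MINIMAX play; `theta = 0` gives uniformly random play.

```python
import math

def soft_minimax(node, theta, ai_to_move=True):
    """Soft cost of `node` from the AI's point of view (illustrative only).

    Assumptions: leaves are numbers (cost for the AI), inner nodes are lists
    of children, players alternate, and theta >= 0 controls AI strength.
    """
    if not isinstance(node, list):           # leaf: terminal cost for the AI
        return float(node)
    values = [soft_minimax(child, theta, not ai_to_move) for child in node]
    if not ai_to_move:
        return max(values)                   # assumed rational opponent maximizes AI cost
    if theta == 0:
        return sum(values) / len(values)     # limit case: uniformly random AI
    # Soft-min (free energy) under a uniform reference policy, shifted by the
    # minimum for numerical stability.
    m = min(values)
    z = sum(math.exp(-theta * (v - m)) for v in values) / len(values)
    return m - math.log(z) / theta

def ai_policy(node, theta):
    """Boltzmann move probabilities for the AI at an inner node."""
    values = [soft_minimax(child, theta, ai_to_move=False) for child in node]
    m = min(values)
    weights = [math.exp(-theta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

# Toy two-ply tree: the AI picks a branch, then the opponent picks a leaf.
tree = [[3, 5], [2, 9], [0, 7]]
weak = ai_policy(tree, 0.0)      # uniform over the three moves
strong = ai_policy(tree, 50.0)   # concentrates on the minimax move (branch 0)
```

The backup visits each node once, so this sketch has the same traversal cost as a plain MINIMAX pass without pruning, which is consistent with the complexity claim in the abstract.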

Similar articles

1
Rminimax: An Optimally Randomized MINIMAX Algorithm.
IEEE Trans Cybern. 2013 Feb;43(1):385-93. doi: 10.1109/TSMCB.2012.2207951. Epub 2012 Aug 6.
2
Randomized shortest-path problems: two related models.
Neural Comput. 2009 Aug;21(8):2363-404. doi: 10.1162/neco.2009.11-07-643.
3
AlphaDDA: strategies for adjusting the playing strength of a fully trained AlphaZero system to a suitable human training partner.
PeerJ Comput Sci. 2022 Oct 4;8:e1123. doi: 10.7717/peerj-cs.1123. eCollection 2022.
4
Adversarial search by evolutionary computation.
Evol Comput. 2001 Fall;9(3):371-85. doi: 10.1162/106365601750406046.
5
Uniqueness of Minimax Strategy in View of Minimum Error Discrimination of Two Quantum States.
Entropy (Basel). 2019 Jul 9;21(7):671. doi: 10.3390/e21070671.
6
Online Minimax Q Network Learning for Two-Player Zero-Sum Markov Games.
IEEE Trans Neural Netw Learn Syst. 2022 Mar;33(3):1228-1241. doi: 10.1109/TNNLS.2020.3041469. Epub 2022 Feb 28.
7
Acquisition of strategies depending on the opponents' competence level.
Arch Psychol (Frankf). 1989;141(2):113-26.
8
Game of strokes: Optimal & conversion strategy algorithms with simulations & application.
Heliyon. 2023 Nov 30;9(12):e23073. doi: 10.1016/j.heliyon.2023.e23073. eCollection 2023 Dec.
9
An Online Minimax Optimal Algorithm for Adversarial Multiarmed Bandit Problem.
IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5565-5580. doi: 10.1109/TNNLS.2018.2806006. Epub 2018 Mar 8.
10
A game theory approach to target tracking in sensor networks.
IEEE Trans Syst Man Cybern B Cybern. 2011 Feb;41(1):2-13. doi: 10.1109/TSMCB.2010.2040733. Epub 2010 Feb 25.