An Online Minimax Optimal Algorithm for Adversarial Multiarmed Bandit Problem.

Author information

Gokcesu Kaan, Kozat Suleyman Serdar

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5565-5580. doi: 10.1109/TNNLS.2018.2806006. Epub 2018 Mar 8.

DOI:10.1109/TNNLS.2018.2806006

Abstract

We investigate the adversarial multiarmed bandit problem and introduce an online algorithm that asymptotically achieves the performance of the best switching bandit arm selection strategy. Our algorithms are truly online such that we do not use the game length or the number of switches of the best arm selection strategy in their constructions. Our results are guaranteed to hold in an individual sequence manner, since we have no statistical assumptions on the bandit arm losses. Our regret bounds, i.e., our performance bounds with respect to the best bandit arm selection strategy, are minimax optimal up to logarithmic terms. We achieve the minimax optimal regret with computational complexity only log-linear in the game length. Thus, our algorithms can be efficiently used in applications involving big data. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art switching bandit algorithms. We also introduce a general efficiently implementable bandit arm selection framework, which can be adapted to various applications.

Similar articles

Asymptotically Optimal Contextual Bandit Algorithm Using Hierarchical Structures.

IEEE Trans Neural Netw Learn Syst. 2019 Mar;30(3):923-937. doi: 10.1109/TNNLS.2018.2854796. Epub 2018 Aug 2.

Online Density Estimation of Nonstationary Sources Using Exponential Family of Distributions.

IEEE Trans Neural Netw Learn Syst. 2018 Sep;29(9):4473-4478. doi: 10.1109/TNNLS.2017.2740003. Epub 2017 Sep 13.

An Optimal Algorithm for the Stochastic Bandits While Knowing the Near-Optimal Mean Reward.

IEEE Trans Neural Netw Learn Syst. 2021 May;32(5):2285-2291. doi: 10.1109/TNNLS.2020.2995920. Epub 2021 May 3.

Overtaking method based on sand-sifter mechanism: Why do optimistic value functions find optimal solutions in multi-armed bandit problems?

Biosystems. 2015 Sep;135:55-65. doi: 10.1016/j.biosystems.2015.06.009. Epub 2015 Jul 10.

Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback.

Neural Comput. 2020 Sep;32(9):1733-1773. doi: 10.1162/neco_a_01299. Epub 2020 Jul 20.

A Thompson Sampling Algorithm With Logarithmic Regret for Unimodal Gaussian Bandit.

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5332-5341. doi: 10.1109/TNNLS.2023.3295360. Epub 2023 Sep 1.

Multiarmed Bandit Algorithms on Zynq System-on-Chip: Go Frequentist or Bayesian?

IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2602-2615. doi: 10.1109/TNNLS.2022.3190509. Epub 2024 Feb 5.

PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison.

IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):15308-15327. doi: 10.1109/TPAMI.2023.3305381. Epub 2023 Nov 3.

Greedy Methods, Randomization Approaches, and Multiarm Bandit Algorithms for Efficient Sparsity-Constrained Optimization.

IEEE Trans Neural Netw Learn Syst. 2017 Nov;28(11):2789-2802. doi: 10.1109/TNNLS.2016.2600243. Epub 2016 Sep 9.
