Suppr超能文献

关于最佳臂识别的基于间隙的下界技术

On Gap-Based Lower Bounding Techniques for Best-Arm Identification.

作者信息

Truong Lan V, Scarlett Jonathan

机构信息

Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK.

Department of Computer Science & Department of Mathematics, National University of Singapore, Singapore 117418, Singapore.

出版信息

Entropy (Basel). 2020 Jul 20;22(7):788. doi: 10.3390/e22070788.

Abstract

In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case that the rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification.

摘要

在本文中,我们考虑在多臂老虎机问题中确定最佳臂识别所需拉臂次数下限的技术。虽然最近基于散度的方法相较于旧的基于差距的方法有改进,但我们表明,在伯努利奖励下的许多感兴趣的情况下,包括奖励远离零和一的情况,后者可以得到改进以与前者匹配(相差常数因子)。结合现有的上限,这表明基于散度的方法和基于差距的方法对于确定最佳臂识别的样本复杂度下限都是有效的。

相似文献

9
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison.
IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):15308-15327. doi: 10.1109/TPAMI.2023.3305381. Epub 2023 Nov 3.
10
An Online Minimax Optimal Algorithm for Adversarial Multiarmed Bandit Problem.一种用于对抗性多臂老虎机问题的在线极小极大最优算法。
IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5565-5580. doi: 10.1109/TNNLS.2018.2806006. Epub 2018 Mar 8.

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验