General Gittins index processes in discrete time.

Authors

El Karoui N, Karatzas I

Affiliation

Laboratoire de Probabilités, Université Pierre et Marie Curie, Paris, France.

Publication

Proc Natl Acad Sci U S A. 1993 Feb 15;90(4):1232-6. doi: 10.1073/pnas.90.4.1232.

DOI:10.1073/pnas.90.4.1232

PMID:11607366

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC45846/

Abstract

We combine the formulation of Mandelbaum [Mandelbaum, A. (1986) Probab. Theory Rel. Fields 71, 129-147] with ideas from Whittle [Whittle, P. (1980) J. R. Stat. Soc. B 42, 143-149] to obtain a simple and constructive proof for the optimality of Gittins index processes in the general, nonmarkovian dynamic allocation (or "multi-armed bandit") problem. Our approach also provides an explicit expression for the value of this problem.

Similar articles

Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges.

Stat Sci. 2015;30(2):199-215. doi: 10.1214/14-STS504.

Response-adaptive randomization for multi-arm clinical trials using the forward looking Gittins index rule.

Biometrics. 2015 Dec;71(4):969-78. doi: 10.1111/biom.12337. Epub 2015 Jun 22.

Covariate-adjusted response-adaptive randomization for multi-arm clinical trials using a modified forward looking Gittins index rule.

Biometrics. 2018 Mar;74(1):49-57. doi: 10.1111/biom.12738. Epub 2017 Jul 6.

Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints.

J Appl Stat. 2018;45(6):1052-1076. doi: 10.1080/02664763.2017.1342780. Epub 2017 Jun 28.

Indexability and optimal index policies for a class of reinitialising restless bandits.

Probab Eng Inf Sci. 2016 Jan;30(1):1-23. doi: 10.1017/S026996481500025X. Epub 2015 Oct 16.

Overtaking method based on sand-sifter mechanism: Why do optimistic value functions find optimal solutions in multi-armed bandit problems?

Biosystems. 2015 Sep;135:55-65. doi: 10.1016/j.biosystems.2015.06.009. Epub 2015 Jul 10.

A behavioural Bayes approach to the determination of sample size for clinical trials considering efficacy and safety: imbalanced sample size in treatment groups.

Stat Methods Med Res. 2011 Aug;20(4):389-400. doi: 10.1177/0962280209358131. Epub 2010 Mar 11.

Risk-aware multi-armed bandit problem with application to portfolio selection.

R Soc Open Sci. 2017 Nov 15;4(11):171377. doi: 10.1098/rsos.171377. eCollection 2017 Nov.

Optimism in the face of uncertainty supported by a statistically-designed multi-armed bandit algorithm.

Biosystems. 2017 Oct;160:25-32. doi: 10.1016/j.biosystems.2017.08.004. Epub 2017 Aug 22.

Cited by

In silico study of medical decision-making for rare diseases: heterogeneity of decision-makers in a population improves overall benefit.

PeerJ. 2018 Sep 25;6:e5677. doi: 10.7717/peerj.5677. eCollection 2018.
