Q-learning for estimating optimal dynamic treatment rules from observational data.

Authors

Moodie Erica E M, Chakraborty Bibhas, Kramer Michael S

Affiliation

McGill University, Department of Epidemiology, Biostatistics, and Occupational Health, QC, Canada H3A 1A2.

Publication information

Can J Stat. 2012 Dec 1;40(4):629-645. doi: 10.1002/cjs.11162. Epub 2012 Nov 7.

Abstract

The area of dynamic treatment regimes (DTR) aims to make inference about adaptive, multistage decision-making in clinical practice. A DTR is a set of decision rules, one per interval of treatment, where each decision is a function of treatment and covariate history that returns a recommended treatment. Q-learning is a popular method from the reinforcement learning literature that has recently been applied to estimate DTRs. While, in principle, Q-learning can be used for both randomized and observational data, the focus in the literature thus far has been exclusively on the randomized treatment setting. We extend the method to incorporate measured confounding covariates, using direct adjustment and a variety of propensity score approaches. The methods are examined under various settings including non-regular scenarios. We illustrate the methods in examining the effect of breastfeeding on vocabulary testing, based on data from the Promotion of Breastfeeding Intervention Trial.
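
The abstract's core idea, backward-recursive Q-learning with measured confounders added to the regression models, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two stages, binary treatments coded 0/1, linear Q-functions, and confounders entered as main effects, which is one plausible reading of "direct adjustment". The data are simulated, and every name (`simulate`, `q_learning`, `recommend`, `C1`, `O1`, and so on) is hypothetical.

```python
import numpy as np

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate(n, seed=0):
    """Toy two-stage observational data (an assumption, not the trial data).

    C1, C2 are measured confounders: they drive both treatment and outcome.
    O1, O2 are tailoring covariates that modify the treatment effects.
    """
    rng = np.random.default_rng(seed)
    C1, O1 = rng.normal(size=n), rng.normal(size=n)
    A1 = rng.binomial(1, expit(C1))          # non-randomized stage-1 treatment
    C2, O2 = rng.normal(size=n), rng.normal(size=n)
    A2 = rng.binomial(1, expit(C2))          # non-randomized stage-2 treatment
    Y = (1.0 + C1 + C2
         + A1 * (0.5 + 0.5 * O1)             # true stage-1 effect
         + A2 * (1.0 - 2.0 * O2)             # true stage-2 effect
         + rng.normal(size=n))
    return dict(C1=C1, O1=O1, A1=A1, C2=C2, O2=O2, A2=A2, Y=Y)

def ols(X, y):
    """Least-squares coefficients for y ~ X (X includes the intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def q_learning(d):
    """Two-stage Q-learning; returns (psi1, psi2), each (intercept, slope)."""
    one = np.ones(len(d["Y"]))
    # Stage 2: regress Y on history, confounders, and A2 * (1, O2).
    H2 = np.column_stack([one, d["O1"], d["C1"], d["A1"], d["A1"] * d["O1"],
                          d["O2"], d["C2"]])
    X2 = np.column_stack([H2, d["A2"], d["A2"] * d["O2"]])
    b2 = ols(X2, d["Y"])
    psi2 = b2[-2:]
    # Pseudo-outcome: fitted value under the best stage-2 treatment.
    # With 0/1 coding, the maximum over A2 adds max(effect, 0).
    y_tilde = H2 @ b2[:-2] + np.maximum(psi2[0] + psi2[1] * d["O2"], 0.0)
    # Stage 1: regress the pseudo-outcome on O1, C1, and A1 * (1, O1).
    X1 = np.column_stack([one, d["O1"], d["C1"], d["A1"], d["A1"] * d["O1"]])
    psi1 = ols(X1, y_tilde)[-2:]
    return psi1, psi2

def recommend(psi, O):
    """Decision rule: treat (1) when the estimated treatment effect is positive."""
    return (psi[0] + psi[1] * np.asarray(O) > 0).astype(int)
```

Calling `psi1, psi2 = q_learning(simulate(20000, seed=1))` and then `recommend(psi2, o2_values)` gives the estimated stage-2 rule for new covariate values. Under this simulation the Q-function models are correctly specified, so the estimates should land near the coefficients used to generate the data. A propensity-score variant would replace or supplement the `C1` and `C2` columns with estimated treatment probabilities.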
