

A Robust Mixed-Effects Bandit Algorithm for Assessing Mobile Health Interventions.

Authors

Huch Easton K, Shi Jieru, Abbott Madeline R, Golbus Jessica R, Moreno Alexander, Dempsey Walter H

Affiliations

Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA.

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.

Publication

Adv Neural Inf Process Syst. 2024;37:128280-128329.

Abstract

Mobile health leverages personalized, contextually-tailored interventions optimized through bandit and reinforcement learning algorithms. Despite its promise, challenges like participant heterogeneity, nonstationarity, and nonlinearity in rewards hinder algorithm performance. We propose a robust contextual bandit algorithm, termed "DML-TS-NNR", that simultaneously addresses these challenges via (1) modeling the differential reward with user- and time-specific incidental parameters, (2) network cohesion penalties, and (3) debiased machine learning for flexible estimation of baseline rewards. We establish a high-probability regret bound that depends solely on the dimension of the differential reward model. This feature enables us to achieve robust regret bounds even when the baseline reward is highly complex. We demonstrate the superior performance of the DML-TS-NNR algorithm in a simulation and two off-policy evaluation studies.
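The key structural idea in the abstract is that only the *differential* reward (the advantage of intervening over not intervening) needs to be modeled parametrically, so regret does not depend on the complexity of the baseline reward. As a rough, self-contained illustration of that idea, here is a minimal Thompson-sampling sketch for a linear differential-reward model; the class name, Gaussian prior, and update rule are illustrative assumptions, not the authors' DML-TS-NNR implementation, which additionally uses user- and time-specific random effects, network cohesion penalties, and debiased machine learning:

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearThompsonSampling:
    """Minimal Thompson sampling for a linear *differential* reward.

    Only the treatment-vs-control advantage is modeled, so the
    (possibly complex) baseline reward never enters the posterior --
    loosely mirroring the motivation for regret bounds that depend
    only on the differential-reward dimension.
    """

    def __init__(self, dim, prior_var=1.0, noise_var=1.0):
        self.V = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)            # precision-weighted posterior mean
        self.noise_var = noise_var

    def choose_action(self, context):
        # Draw one coefficient vector from the Gaussian posterior...
        cov = np.linalg.inv(self.V)
        theta = rng.multivariate_normal(cov @ self.b, cov)
        # ...and treat iff the sampled differential reward is positive.
        return int(context @ theta > 0)

    def update(self, context, action, reward):
        # Only treated steps carry information about the differential reward.
        if action == 1:
            self.V += np.outer(context, context) / self.noise_var
            self.b += context * reward / self.noise_var

# Toy simulation: reward = action * (context @ theta_true) + noise.
theta_true = np.array([1.0, 0.5])
bandit = LinearThompsonSampling(dim=2)
for _ in range(500):
    x = rng.normal(size=2)
    a = bandit.choose_action(x)
    r = a * (x @ theta_true) + rng.normal(scale=0.1)
    bandit.update(x, a, r)
```

Because the posterior is updated only with treated observations of the differential reward, any nonlinearity or nonstationarity in the baseline reward is irrelevant to this learner; the full algorithm in the paper layers pooling across users and time on top of this basic mechanism.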


Similar Articles

Predictive modeling of complications arising from early-onset preeclampsia in pregnant women. Womens Health (Lond). 2025 Jan-Dec;21:17455057251348978. doi: 10.1177/17455057251348978. Epub 2025 Jul 21.

Incentives for smoking cessation. Cochrane Database Syst Rev. 2025 Jan 13;1(1):CD004307. doi: 10.1002/14651858.CD004307.pub7.

