ReBandit: Random Effects Based Online RL Algorithm for Reducing Cannabis Use.

Authors

Ghosh Susobhan, Guo Yongyi, Hung Pei-Yao, Coughlin Lara, Bonar Erin, Nahum-Shani Inbal, Walton Maureen, Murphy Susan

Affiliations

Department of Computer Science, Harvard University.

Department of Statistics, University of Wisconsin-Madison.

Publication information

IJCAI (U S). 2024 Aug;2024:7278-7286.

PMID: 39735853

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11671148/

Abstract

The escalating prevalence of cannabis use and associated cannabis use disorder (CUD) poses a significant public health challenge globally. With a notably wide treatment gap, especially among emerging adults (EAs; ages 18-25), addressing cannabis use and CUD remains a pivotal objective within the 2030 United Nations Agenda for Sustainable Development Goals (SDG). In this work, we develop an online reinforcement learning (RL) algorithm called reBandit, which will be used in a mobile health study to deliver personalized mobile health interventions aimed at reducing cannabis use among EAs. reBandit utilizes random effects and informative Bayesian priors to learn quickly and efficiently in noisy mobile health environments. Moreover, reBandit employs Empirical Bayes and optimization techniques to autonomously update its hyper-parameters online. To evaluate the performance of our algorithm, we construct a simulation testbed using data from a prior study and compare against algorithms commonly used in mobile health studies. We show that reBandit performs as well as or better than all the baseline algorithms, and that the performance gap widens as population heterogeneity increases in the simulation environment, demonstrating its ability to adapt to a diverse population of study participants.
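
To make the random-effects idea named in the title concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar Gaussian model in which each participant's intervention effect is drawn around a population mean, and it treats the noise variance and random-effect variance as known, whereas the abstract says reBandit updates its hyper-parameters online with Empirical Bayes. The function names, parameters, and the posterior-sampling action rule are hypothetical and chosen only for illustration.

```python
import math
import random


def posterior_user_effect(user_mean, n_obs, pop_mean, noise_var, random_effect_var):
    """Posterior mean and variance of one participant's effect.

    Assumed model (illustrative only):
        effect_i ~ N(pop_mean, random_effect_var)   # random effect
        y        ~ N(effect_i, noise_var)           # noisy observations
    user_mean is the average of the participant's n_obs observations.
    """
    precision = 1.0 / random_effect_var + n_obs / noise_var
    mean = (pop_mean / random_effect_var + n_obs * user_mean / noise_var) / precision
    return mean, 1.0 / precision


def thompson_send(rng, mean, var):
    """Draw one posterior sample; send the intervention if it is positive."""
    return rng.gauss(mean, math.sqrt(var)) > 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    # A new participant (no data) falls back on the population-level estimate.
    print(posterior_user_effect(0.0, 0, 0.3, 1.0, 0.25))
    # With more data, the estimate moves toward the participant's own average.
    mean, var = posterior_user_effect(2.0, 20, 0.3, 1.0, 0.25)
    print(mean, var, thompson_send(rng, mean, var))
```

With few observations the estimate is pulled toward the population mean, and with many it approaches the participant's own average. That shrinkage is one way pooling across participants can speed up learning when individual data are noisy.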
