Di Shuang, Petch Jeremy, Gerstein Hertzel C, Zhu Ruoqing, Sherifali Diana
Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada.
Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
JMIR Form Res. 2022 Sep 13;6(9):e37838. doi: 10.2196/37838.
Health coaching is an emerging intervention that has been shown to improve clinical and patient-relevant outcomes for type 2 diabetes. Advances in artificial intelligence may provide an avenue for developing a more personalized, adaptive, and cost-effective approach to diabetes health coaching.
We aim to apply Q-learning, a widely used reinforcement learning algorithm, to a diabetes health-coaching data set to develop a model for recommending an optimal coaching intervention at each decision point that is tailored to a patient's accumulated history.
In this pilot study, we fit a two-stage reinforcement learning model on 177 patients from the intervention arm of a community-based randomized controlled trial conducted in Canada. The policy produced by the reinforcement learning model can recommend a coaching intervention at each decision point that is tailored to a patient's accumulated history and is expected to maximize the composite clinical outcome of hemoglobin A1c reduction and quality of life improvement (normalized to [0, 1], with a higher score being better). Our data, models, and source code are publicly available.
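To make the two-stage fitting procedure concrete, the following is a minimal sketch of Q-learning by backward induction. It is not the authors' published code. It assumes a binary action, a linear Q-function with history-by-action interactions, and synthetic data; the function names (`q_features`, `fit_q`, `q_values`, `recommend`, `fit_two_stage`) are hypothetical.

```python
import numpy as np

def q_features(history, action):
    """Design matrix: intercept, history, action, history-by-action interactions."""
    history = np.asarray(history, dtype=float)
    action = np.asarray(action, dtype=float).reshape(-1, 1)
    ones = np.ones((len(history), 1))
    return np.hstack([ones, history, action, history * action])

def fit_q(history, action, target):
    """Least-squares fit of a linear Q-function (an assumed model class)."""
    coef, *_ = np.linalg.lstsq(q_features(history, action), target, rcond=None)
    return coef

def q_values(coef, history, actions):
    """Predicted Q for every candidate action; shape (n_patients, n_actions)."""
    n = len(history)
    return np.column_stack(
        [q_features(history, np.full(n, a)) @ coef for a in actions]
    )

def recommend(coef, history, actions=(0, 1)):
    """Pick the action with the highest predicted Q for each patient."""
    return np.asarray(actions)[q_values(coef, history, actions).argmax(axis=1)]

def fit_two_stage(s1, a1, r1, s2, a2, r2, actions=(0, 1)):
    # Stage 2 first: regress the stage-2 outcome on the accumulated history.
    h2 = np.column_stack([s1, a1, s2])
    coef2 = fit_q(h2, a2, r2)
    # Value of acting optimally at stage 2, given each patient's history.
    v2 = q_values(coef2, h2, actions).max(axis=1)
    # Stage 1: regress the stage-1 outcome plus the optimal stage-2 value.
    coef1 = fit_q(s1, a1, r1 + v2)
    return coef1, coef2

# Synthetic data (not trial data): action 1 helps only when the state is positive.
rng = np.random.default_rng(0)
n = 500
s1 = rng.normal(size=(n, 1))
a1 = rng.integers(0, 2, size=n)
r1 = a1 * s1[:, 0]
s2 = rng.normal(size=(n, 1))
a2 = rng.integers(0, 2, size=n)
r2 = a2 * s2[:, 0]

coef1, coef2 = fit_two_stage(s1, a1, r1, s2, a2, r2)
rec1 = recommend(coef1, s1)
rec2 = recommend(coef2, np.column_stack([s1, a1, s2]))
```

The stage-2 model is fit first so that its maximized prediction can serve as the future value in the stage-1 regression target. In the synthetic setup the best action at each stage is 1 exactly when that stage's state is positive, so `rec1` and `rec2` should largely recover that rule.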
Among the 177 patients, the coaching intervention recommended by our policy matched the intervention actually delivered by the diabetes health coach in 17.5% (n=31) of the patients in stage 1 and 14.1% (n=25) of the patients in stage 2. For patients where the two agreed in both stages, the average cumulative composite outcome (0.839, 95% CI 0.460-1.220) was higher than for patients where the optimal policy agreed with the diabetes health coach in only one stage (0.791, 95% CI 0.747-0.836) or differed in both stages (0.755, 95% CI 0.728-0.781). Additionally, the average cumulative composite outcome predicted for the policy's recommendations was significantly better than that of the observed diabetes health coach's recommendations (t=10.040; P<.001).
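The agreement analysis can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: `agreement_summary` is a hypothetical helper, and the four-patient arrays are toy values rather than trial data. The last lines only re-derive the reported percentages from the reported counts.

```python
import numpy as np

def agreement_summary(policy_a1, coach_a1, policy_a2, coach_a2, outcome):
    """Per-stage agreement rates and mean outcome by number of agreeing stages."""
    agree1 = np.asarray(policy_a1) == np.asarray(coach_a1)
    agree2 = np.asarray(policy_a2) == np.asarray(coach_a2)
    n_agree = agree1.astype(int) + agree2.astype(int)  # 0, 1, or 2 stages
    outcome = np.asarray(outcome, dtype=float)
    mean_by_group = {
        k: float(outcome[n_agree == k].mean())
        for k in (0, 1, 2)
        if np.any(n_agree == k)
    }
    return float(agree1.mean()), float(agree2.mean()), mean_by_group

# Toy example with four hypothetical patients.
rate1, rate2, means = agreement_summary(
    policy_a1=[0, 1, 1, 0], coach_a1=[0, 1, 0, 1],
    policy_a2=[1, 1, 0, 0], coach_a2=[1, 0, 0, 1],
    outcome=[0.9, 0.8, 0.7, 0.6],
)

# Re-derive the reported percentages from the reported counts.
n_patients = 177
stage1_pct = round(100 * 31 / n_patients, 1)
stage2_pct = round(100 * 25 / n_patients, 1)
```

The reported counts give 31/177 = 17.5% and 25/177 = 14.1%, matching the percentages stated in the abstract.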
Applying reinforcement learning to diabetes health coaching could allow for both the automation of health coaching and an improvement in health outcomes produced by this type of intervention.