IEEE J Biomed Health Inform. 2021 Apr;25(4):1223-1232. doi: 10.1109/JBHI.2020.3014556. Epub 2021 Apr 6.
People with Type 1 diabetes (T1D) require regular exogenous infusion of insulin to maintain their blood glucose concentration in a therapeutically adequate target range. Although the artificial pancreas and continuous glucose monitoring have proven effective in achieving closed-loop control, significant challenges remain due to the high complexity of glucose dynamics and limitations in the technology. In this work, we propose a novel deep reinforcement learning model for single-hormone (insulin) and dual-hormone (insulin and glucagon) delivery. In particular, the delivery strategies are developed by double Q-learning with dilated recurrent neural networks. For design and testing, the FDA-accepted UVA/Padova Type 1 simulator was employed. First, we performed long-term generalized training to obtain a population model. Then, this model was personalized with a small dataset of subject-specific data. In silico results show that the single- and dual-hormone delivery strategies achieve good glucose control compared with a standard basal-bolus therapy with low-glucose insulin suspension. Specifically, in the adult cohort (n = 10), percentage time in the target range [70, 180] mg/dL improved from 77.6% to 80.9% with single-hormone control, and to 85.6% with dual-hormone control. In the adolescent cohort (n = 10), percentage time in target range improved from 55.5% to [Formula: see text] with single-hormone control, and to 78.8% with dual-hormone control. In all scenarios, a significant decrease in hypoglycemia was observed. These results show that deep reinforcement learning is a viable approach for closed-loop glucose control in T1D.
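The headline metric above, percentage time in the target range [70, 180] mg/dL, can be illustrated with a minimal sketch. It assumes equally spaced glucose readings (so the fraction of samples equals the fraction of time) and treats both range bounds as inclusive. The function name `time_in_ranges` and the sample readings are hypothetical and are not taken from the paper or the simulator.

```python
from typing import Dict, Sequence


def time_in_ranges(
    glucose_mg_dl: Sequence[float],
    low: float = 70.0,
    high: float = 180.0,
) -> Dict[str, float]:
    """Return percentage of readings below, within, and above [low, high].

    Assumption: readings are equally spaced in time, so sample
    percentages can be read as time percentages.
    """
    if not glucose_mg_dl:
        raise ValueError("need at least one glucose reading")
    n = len(glucose_mg_dl)
    below = sum(1 for g in glucose_mg_dl if g < low)   # hypoglycemia
    above = sum(1 for g in glucose_mg_dl if g > high)  # hyperglycemia
    in_range = n - below - above                       # bounds inclusive
    return {
        "below": 100.0 * below / n,
        "in_range": 100.0 * in_range / n,
        "above": 100.0 * above / n,
    }


# Made-up readings in mg/dL, for illustration only.
readings = [65, 90, 120, 150, 185, 200, 110, 100, 75, 60]
result = time_in_ranges(readings)
print(result)
```

With real continuous glucose monitoring traces, any gaps or irregular sampling would need to be handled before a calculation like this is meaningful.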