Oroojeni Mohammad Javad Mahsa, Agboola Stephen Olusegun, Jethwani Kamal, Zeid Abe, Kamarthi Sagar
Department of Information Technology and Analytics, Kogod School of Business, American University, Washington, DC, United States.
Department of Dermatology, Harvard Medical School, Boston, MA, United States.
JMIR Diabetes. 2019 Aug 28;4(3):e12905. doi: 10.2196/12905.
Type 1 diabetes mellitus (T1DM) is characterized by chronic insulin deficiency and consequent hyperglycemia. Patients with T1DM require long-term exogenous insulin therapy to regulate blood glucose levels and prevent the long-term complications of the disease. Currently, there are no effective algorithms that consider the unique characteristics of T1DM patients to automatically recommend personalized insulin dosage levels.
The objective of this study was to develop and validate a general reinforcement learning (RL) framework for the personalized treatment of T1DM using clinical data.
This research presents a model-free data-driven RL algorithm, namely Q-learning, that recommends insulin doses to regulate the blood glucose level of a T1DM patient, considering his or her state defined by glycated hemoglobin (HbA1c) levels, body mass index, engagement in physical activity, and alcohol usage. In this approach, the RL agent identifies the different states of the patient by exploring the patient's responses when he or she is subjected to varying insulin doses. On the basis of the result of a treatment action at time step t, the RL agent receives a numeric reward, positive or negative. The reward is calculated as a function of the difference between the actual blood glucose level achieved in response to the insulin dose and the targeted HbA1c level. The RL agent was trained on 10 years of clinical data of patients treated at the Mass General Hospital.
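The Q-learning update described above can be illustrated with a minimal tabular sketch. It is not the study's implementation: the dose bins, the target value, the learning rate, the discount factor, the reward shape, and the toy transitions are all hypothetical, chosen only to show how a reward based on the gap to a glycemic target drives the update when learning from logged records.

```python
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the study.
ACTIONS = [0, 1, 2, 3]      # indices of hypothetical insulin dose bins
TARGET = 7.0                # hypothetical glycemic target (HbA1c, %)
ALPHA, GAMMA = 0.1, 0.9     # hypothetical learning rate and discount factor


def reward(achieved, target=TARGET):
    """Positive when the achieved level is within 1 unit of the target, negative otherwise."""
    return 1.0 - abs(achieved - target)


def q_update(q, state, action, r, next_state):
    """Standard Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (r + GAMMA * best_next - q[(state, action)])


def train(transitions, epochs=50):
    """Sweep repeatedly over logged (state, action, achieved level, next state) records."""
    q = defaultdict(float)
    for _ in range(epochs):
        for state, action, achieved, next_state in transitions:
            q_update(q, state, action, reward(achieved), next_state)
    return q


def recommend(q, state):
    """Greedy recommendation: the dose bin with the highest learned value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Toy records; a state is (HbA1c band, BMI band, physically active, alcohol use).
high = ("high", "normal", "active", "no")
on_target = ("target", "normal", "active", "no")
toy_transitions = [
    (high, 2, 7.1, on_target),  # dose bin 2 brought the patient close to target
    (high, 0, 9.0, high),       # dose bin 0 left the patient far above target
]
q_table = train(toy_transitions)
best_bin = recommend(q_table, high)
```

In this toy run the agent prefers dose bin 2 for the `high` state, because that action earned a positive reward while bin 0 earned a negative one. A real system would need a clinically validated state discretization and reward design.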
A total of 87 patients were included in the training set. The mean age of these patients was 53 years, 59% (51/87) were male, 86% (75/87) were white, and 47% (41/87) were married. The performance of the RL agent was evaluated on 60 test cases. The insulin dosage interval recommended by the RL agent included the actual dose prescribed by the physician in 53 of the 60 cases (88%).
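The agreement metric reported here, the share of test cases in which the recommended interval contains the physician's dose, can be computed as in the following sketch. The function name and the toy intervals are hypothetical; only the 53/60 figure comes from the abstract.

```python
def interval_coverage(intervals, prescribed):
    """Count cases where the prescribed dose lies inside the recommended interval."""
    hits = sum(low <= dose <= high for (low, high), dose in zip(intervals, prescribed))
    return hits, hits / len(prescribed)


# Toy data for illustration only (not from the study).
toy_intervals = [(10, 20), (5, 8), (30, 40)]
toy_doses = [15, 9, 30]
toy_hits, toy_rate = interval_coverage(toy_intervals, toy_doses)

# Reported result: 53 of 60 test cases, as a rounded percentage.
reported_pct = round(100 * 53 / 60)
```

With the toy data, two of the three doses fall inside their intervals; the reported 53/60 rounds to 88%.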
This exploratory study demonstrates that an RL algorithm can be used to recommend personalized insulin doses to achieve adequate glycemic control in patients with T1DM. However, further investigation in a larger sample of patients is needed to confirm these findings.