Hotta Shinji, Kytö Mikko, Koivusalo Saila, Heinonen Seppo, Marttinen Pekka
Department of Computer Science, Aalto University, Espoo, Finland.
Fujitsu Limited, Kawasaki, Japan.
PLoS One. 2024 Aug 1;19(8):e0298506. doi: 10.1371/journal.pone.0298506. eCollection 2024.
In recent years, numerous methods have been introduced to predict glucose levels by applying machine-learning techniques to patients' daily behavioral and continuous glucose data. Nevertheless, there is no consensus on how to model the combined effects of diet and exercise for optimal glucose prediction. A notable challenge is that models trained on observational patient datasets from uncontrolled environments tend to overfit, because the feature distributions of the target behaviors are skewed; for instance, diabetic patients seldom engage in high-intensity exercise after a meal.
In this study, we introduce a novel application of Bayesian transfer learning to postprandial glucose prediction using randomized controlled trial (RCT) data. The data comprise time series of three key variables: continuous glucose levels, exercise expenditure, and carbohydrate intake. To build the optimal model for predicting postprandial glucose levels, we first gathered balanced training data from RCTs on healthy participants by randomizing behavioral conditions. We then pretrained the model's parameter distribution on the RCT data from the healthy cohort. This pretrained distribution was adjusted, transferred, and used to determine the model parameters for each patient.
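The pretrain, adjust, and transfer steps can be sketched as follows. This is only an illustration under stated assumptions, not the authors' model: the abstract does not specify the functional form, so a conjugate Gaussian linear regression with known noise variance stands in for it. The function names, the synthetic data, the normalized units, and the covariance inflation factor of 4.0 (one possible reading of "adjusted") are all hypothetical.

```python
import numpy as np

def gaussian_posterior(X, y, prior_mean, prior_cov, noise_var):
    """Conjugate update for Bayesian linear regression with known noise variance."""
    prior_prec = np.linalg.inv(prior_cov)
    post_prec = prior_prec + X.T @ X / noise_var
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

def simulate(rng, n, weights, exercise_max, noise_sd):
    """Synthetic (hypothetical) data: columns are intercept, carbs, exercise."""
    carbs = rng.uniform(0.0, 1.0, n)
    exercise = rng.uniform(0.0, exercise_max, n)
    X = np.column_stack([np.ones(n), carbs, exercise])
    y = X @ weights + rng.normal(0.0, noise_sd, n)
    return X, y

rng = np.random.default_rng(0)
noise_sd = 0.5
noise_var = noise_sd ** 2
broad_mean, broad_cov = np.zeros(3), 100.0 * np.eye(3)  # weakly informative prior

# Step 1 (pretrain): balanced data from the healthy cohort, where exercise
# covers its full range because conditions were randomized.
X_h, y_h = simulate(rng, 200, np.array([5.0, 3.0, -1.5]), 1.0, noise_sd)
pre_mean, pre_cov = gaussian_posterior(X_h, y_h, broad_mean, broad_cov, noise_var)

# Step 2 (adjust): inflate the covariance so that patient data can move the
# parameters away from the healthy-cohort values. The factor is an assumption.
transfer_cov = 4.0 * pre_cov

# Step 3 (transfer): a patient with few, skewed observations (little exercise).
X_p, y_p = simulate(rng, 15, np.array([6.0, 4.0, -1.0]), 0.1, noise_sd)
patient_mean, patient_cov = gaussian_posterior(X_p, y_p, pre_mean, transfer_cov, noise_var)

# Baseline for comparison: the same patient data with only the broad prior.
scratch_mean, scratch_cov = gaussian_posterior(X_p, y_p, broad_mean, broad_cov, noise_var)

print("exercise-effect variance, transferred prior:", patient_cov[2, 2])
print("exercise-effect variance, broad prior:      ", scratch_cov[2, 2])
```

In this toy setting the patient rarely exercises, so the patient's own data say little about the exercise coefficient; the transferred prior supplies that information, which is the motivation the abstract gives for pretraining on balanced RCT data.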
The proposed method was evaluated using data from 68 gestational diabetes mellitus (GDM) patients in uncontrolled settings. The evaluation showed that our method improved prediction performance. Furthermore, when modeling the joint impact of diet and exercise, the synergetic model proved more accurate than its additive counterpart.
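The difference between an additive and a synergetic model can be illustrated with a small least-squares sketch. The abstract does not give either model's equations, so the forms below are assumptions: the additive model uses separate carbohydrate and exercise terms, and the synergetic model adds a carbohydrate-by-exercise product term. The data are synthetic and noise-free, and all names are hypothetical.

```python
import numpy as np

def design(carbs, exercise, synergy):
    """Design matrix: intercept, carbs, exercise, and optionally their product."""
    cols = [np.ones_like(carbs), carbs, exercise]
    if synergy:
        cols.append(carbs * exercise)  # assumed form of the joint effect
    return np.column_stack(cols)

def fit_rss(X, y):
    """Ordinary least squares fit; returns coefficients and residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

rng = np.random.default_rng(1)
carbs = rng.uniform(0.0, 1.0, 100)
exercise = rng.uniform(0.0, 1.0, 100)

# Hypothetical ground truth in which exercise blunts the carbohydrate response.
true_coef = np.array([5.0, 3.0, -1.0, -2.0])
y = design(carbs, exercise, synergy=True) @ true_coef

coef_add, rss_add = fit_rss(design(carbs, exercise, synergy=False), y)
coef_syn, rss_syn = fit_rss(design(carbs, exercise, synergy=True), y)

print("additive RSS:  ", rss_add)
print("synergetic RSS:", rss_syn)
```

Because the additive model is nested inside the synergetic one, its in-sample residual can never be lower; whether the extra term helps on held-out patient data is an empirical question, which is what the evaluation above addresses.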
An innovative application of transfer learning using randomized controlled trial data can improve the challenging task of modeling postprandial glucose for GDM patients while integrating both dietary and exercise behaviors. For more accurate prediction, future research should focus on incorporating the long-term effects of exercise and other glycemia-related factors such as stress and sleep.