Kalita Deepjyoti, Sharma Hrishita, Mirza Khalid B
IEEE J Biomed Health Inform. 2024 Aug;28(8):4963-4974. doi: 10.1109/JBHI.2024.3396880. Epub 2024 Aug 6.
An artificial pancreas requires data from multiple sources for accurate insulin dose estimation: continuous glucose sensor readings, past insulin dosage information, meal quantity and timing, and physical activity data. The effectiveness of closed-loop diabetes management systems can be hampered when these data are missing because of device error or poor patient compliance. In this study, we demonstrate the effect of selecting the generative and discriminative models according to output sequence length on high-quality data generation and augmentation. The proposed generative adversarial network (GAN) based architecture automatically selects the generator and discriminator architectures based on the desired output sequence length, and it can generate glucose, physical activity, and meal information data for individual patients. The discriminative scores for the OhioT1DM (2018) dataset were 0.17 ± 0.03 (inputs: continuous glucose monitoring (CGM), carbohydrate (CHO), insulin) and 0.15 ± 0.02 (inputs: CGM, CHO, insulin, heart rate, steps); for the OhioT1DM (2020) dataset they were 0.16 ± 0.02 (inputs: CGM, CHO, insulin) and 0.15 ± 0.02 (inputs: CGM, CHO, insulin, acceleration). A mixture of generated and real data was used to test predictive scores for glucose forecasting models. The best root mean square error (RMSE) and mean absolute relative difference (MARD) achieved for OhioT1DM patients were 17.19 ± 3.22 and 7.14 ± 1.76 at a prediction horizon (PH) of 30 min, with CGM, CHO, insulin, heart rate, and steps as inputs. Similarly, the RMSE and MARD for real+synthetic data were 15.63 ± 2.57 and 5.86 ± 1.69, respectively. Compared with existing generative models, we demonstrate that sequence-length-based architecture selection leads to better synthetic data generation for multiple output sequences (CGM, CHO, insulin) and to better forecasting accuracy.
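For readers unfamiliar with the two forecasting metrics reported above, the following is a minimal sketch of how RMSE and MARD are commonly computed for glucose forecasts. It is not taken from the paper: the function names `rmse` and `mard`, the sample readings, and the choice to express MARD as a percentage of the reference value are all assumptions made for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the same units as the inputs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mard(actual, predicted):
    """Mean absolute relative difference, as a percentage of the reference.

    Assumption: the reference (actual) values are strictly positive,
    which holds for glucose concentrations.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    relative = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical reference readings and forecasts (illustrative values only).
reference = [100.0, 150.0, 200.0]
forecast = [110.0, 135.0, 210.0]

print(f"RMSE: {rmse(reference, forecast):.2f}")
print(f"MARD: {mard(reference, forecast):.2f} %")
```

RMSE penalizes large errors more heavily, while MARD normalizes each error by the reference reading, so the two can rank forecasting models differently.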