Department of Biomedical Informatics, School of Medicine, Ajou University, Suwon, Gyeonggi-do, Republic of Korea.
Department of Brain Science, School of Medicine, Ajou University, Suwon, Gyeonggi-do, Republic of Korea.
JMIR Mhealth Uhealth. 2020 Jul 23;8(7):e16113. doi: 10.2196/16113.
Data collected by an actigraphy device worn on the wrist or waist can provide objective measurements for studies related to physical activity; however, some data may contain intervals where values are missing. In previous studies, statistical methods have been applied to impute missing values on the basis of statistical assumptions. Deep learning algorithms, however, can learn features from the data without any such assumptions and may outperform previous approaches in imputation tasks.
The aim of this study was to impute missing values in actigraphy data using a deep learning approach.
To develop an imputation model for missing values in accelerometer-based actigraphy data, a denoising convolutional autoencoder was adopted. We trained and tested our deep learning-based imputation model with the National Health and Nutrition Examination Survey data set and validated it with the external Korea National Health and Nutrition Examination Survey and the Korean Chronic Cerebrovascular Disease Oriented Biobank data sets, which consist of daily records measuring activity counts. The partial root mean square error (partial RMSE) and partial mean absolute error (partial MAE) of the imputed intervals were calculated for our deep learning-based imputation model (zero-inflated denoising convolutional autoencoder) and for three other approaches (mean imputation, zero-inflated Poisson regression, and Bayesian regression).
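The evaluation metrics described above can be illustrated with a minimal sketch. It assumes that "partial" means the error is computed only over the positions that were imputed, which is how the abstract describes it ("of the imputed intervals"); the exact definition used in the study may differ. The function names, the mask convention, and the activity counts are hypothetical and are not taken from the study. Mean imputation is shown only as the simplest of the baselines named above.

```python
import math

def partial_errors(truth, imputed, mask):
    """Return (partial RMSE, partial MAE) over positions where mask is True.

    Assumption: only imputed positions contribute to the error.
    """
    diffs = [i - t for t, i, m in zip(truth, imputed, mask) if m]
    if not diffs:
        raise ValueError("mask selects no positions")
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return rmse, mae

def mean_impute(observed, mask):
    """Fill masked positions with the mean of the unmasked values."""
    known = [v for v, m in zip(observed, mask) if not m]
    fill = sum(known) / len(known)
    return [fill if m else v for v, m in zip(observed, mask)]

# Hypothetical activity counts; True marks an interval treated as missing.
truth = [0, 0, 120, 300, 80, 0]
mask = [False, False, True, True, False, False]

imputed = mean_impute(truth, mask)
rmse, mae = partial_errors(truth, imputed, mask)
print(f"partial RMSE = {rmse:.1f} counts, partial MAE = {mae:.1f} counts")
```

Restricting the error to the masked positions keeps the observed values, which every method reproduces exactly, from diluting the comparison between methods.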
The zero-inflated denoising convolutional autoencoder exhibited a partial RMSE of 839.3 counts and partial MAE of 431.1 counts, whereas mean imputation achieved a partial RMSE of 1053.2 counts and partial MAE of 545.4 counts, the zero-inflated Poisson regression model achieved a partial RMSE of 1255.6 counts and partial MAE of 508.6 counts, and Bayesian regression achieved a partial RMSE of 924.5 counts and partial MAE of 605.8 counts.
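To make the comparison easier to read, the reported errors can be expressed as relative reductions against each baseline. The sketch below uses only the figures stated above; the helper `relative_reduction` and the dictionary layout are illustrative assumptions and are not part of the study.

```python
# (partial RMSE, partial MAE) in counts, as reported in the results above.
results = {
    "mean imputation": (1053.2, 545.4),
    "zero-inflated Poisson regression": (1255.6, 508.6),
    "Bayesian regression": (924.5, 605.8),
}
autoencoder = (839.3, 431.1)  # zero-inflated denoising convolutional autoencoder

def relative_reduction(baseline, model):
    """Percentage by which the model's error is lower than the baseline's."""
    return 100.0 * (baseline - model) / baseline

for name, (rmse, mae) in results.items():
    rmse_gain = relative_reduction(rmse, autoencoder[0])
    mae_gain = relative_reduction(mae, autoencoder[1])
    print(f"vs {name}: RMSE -{rmse_gain:.1f}%, MAE -{mae_gain:.1f}%")
```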
Our deep learning-based imputation model performed better than the other methods when imputing missing values in actigraphy data.