School of Electrical Engineering and Computer Science, Washington State University, Pullman, Washington, United States.
Methods Inf Med. 2022 Sep;61(3-04):99-110. doi: 10.1055/s-0042-1756649. Epub 2022 Oct 11.
Behavior and health are inextricably linked. As a result, continuous wearable sensor data offer the potential to predict clinical measures. However, interruptions in the data collection occur, which create a need for strategic data imputation.
The objective of this work is to adapt a data generation algorithm to impute multivariate time series data. This will allow us to create digital behavior markers that can predict clinical health measures.
We created a bidirectional time series generative adversarial network to impute missing sensor readings. Values are imputed based on relationships between multiple fields and multiple points in time, for single time points or larger time gaps. From the complete data, digital behavior markers are extracted and are mapped to predicted clinical measures.
We validate our approach using continuous smartwatch data for n = 14 participants. When reconstructing omitted data, we observe an average normalized mean absolute error of 0.0197. We then create machine learning models to predict clinical measures from the reconstructed, complete data with correlations ranging from r = 0.1230 to r = 0.7623. This work indicates that wearable sensor data collected in the wild can be used to offer insights on a person's health in natural settings.
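To make the reconstruction metric concrete, the following is a minimal sketch of how a normalized mean absolute error over deliberately omitted readings could be computed. The abstract does not say how the error is normalized, so per-feature min-max scaling is an assumption here, and the function name `normalized_mae` and the toy arrays are hypothetical, not taken from the study.

```python
import numpy as np


def normalized_mae(truth, imputed, omitted):
    """Mean absolute error over omitted entries after per-feature scaling.

    Assumption: each feature (column) is min-max scaled using the range of
    the ground-truth values; the source does not specify its normalization.
    """
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    omitted = np.asarray(omitted, dtype=bool)

    low = truth.min(axis=0)
    span = truth.max(axis=0) - low
    span[span == 0] = 1.0  # guard against constant features

    scaled_error = np.abs((truth - low) / span - (imputed - low) / span)
    # Score only the entries that were held out and then reconstructed.
    return float(scaled_error[omitted].mean())


# Toy multivariate series: 5 time steps, 2 sensor fields (made-up values).
truth = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [3.0, 40.0], [4.0, 50.0]])

# Omit one isolated reading and one two-step gap.
omitted = np.zeros_like(truth, dtype=bool)
omitted[1, 0] = True
omitted[2:4, 1] = True

# Stand-in for the output of any imputation model.
imputed = truth.copy()
imputed[1, 0] = 1.4
imputed[2, 1] = 34.0
imputed[3, 1] = 36.0

score = normalized_mae(truth, imputed, omitted)
print(f"normalized MAE: {score:.4f}")
```

Scoring only the omitted entries keeps the metric focused on values the model had to reconstruct, and scaling each field separately prevents sensors with large numeric ranges from dominating the average.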