Nickerson Paul, Baharloo Raheleh, Davoudi Anis, Bihorac Azra, Rashidi Parisa
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:4106-4109. doi: 10.1109/EMBC.2018.8513303.
Physiological time series such as vital signs contain important information about a patient and are used in a variety of clinical applications; however, they suffer from missing values and irregular sampling. In recent years, Gaussian Processes have been used as sophisticated nonlinear imputation methods for time series, but they have rarely been compared against simpler methods. This paper compares five methods for imputing missing data in physiological time series: linear interpolation as the baseline, cubic spline interpolation, and three non-linear methods, namely Single-Task Gaussian Processes, Multi-Task Gaussian Processes, and Multivariate Imputation by Chained Equations. We used seven intraoperative physiological time series from 27,481 patients. Piecewise aggregate approximation was employed as a dimensionality reduction and resampling strategy. Linear interpolation and cubic spline interpolation show overall superiority in predicting the missing values compared to the more complex models. The performance of the kernel-based methods suggests that they are highly sensitive to the kernel width and require the incorporation of domain knowledge for fine-tuning.
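To make the resampling and baseline steps concrete, the following is a minimal sketch of piecewise aggregate approximation followed by linear interpolation of the empty windows. It is not the authors' implementation: the function names `paa` and `linear_impute`, the window width, and the sample heart-rate values are all assumptions chosen for illustration, and the abstract does not state the settings actually used.

```python
import math


def paa(times, values, t_start, t_end, n_segments):
    """Piecewise aggregate approximation (illustrative sketch).

    Splits [t_start, t_end) into n_segments equal-width windows and returns
    the mean of the samples falling in each window. Windows that contain no
    samples are reported as NaN, i.e. as missing values.
    """
    width = (t_end - t_start) / n_segments
    sums = [0.0] * n_segments
    counts = [0] * n_segments
    for t, v in zip(times, values):
        if not (t_start <= t < t_end) or math.isnan(v):
            continue  # ignore out-of-range timestamps and missing readings
        idx = min(int((t - t_start) / width), n_segments - 1)
        sums[idx] += v
        counts[idx] += 1
    return [s / c if c else math.nan for s, c in zip(sums, counts)]


def linear_impute(series):
    """Fill NaN entries by linear interpolation between observed neighbours.

    Leading and trailing gaps are filled with the nearest observed value
    (an assumption; other edge policies are possible).
    """
    out = list(series)
    known = [i for i, v in enumerate(out) if not math.isnan(v)]
    if not known:
        return out  # nothing observed, nothing to interpolate from
    for i, v in enumerate(out):
        if not math.isnan(v):
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            out[i] = series[left] + frac * (series[right] - series[left])
    return out


# Hypothetical, irregularly sampled heart-rate readings (minutes, bpm).
times = [0.0, 1.0, 2.5, 7.0, 9.5]
values = [80.0, 82.0, 81.0, 90.0, 92.0]

resampled = paa(times, values, t_start=0.0, t_end=10.0, n_segments=5)
filled = linear_impute(resampled)
print(resampled)  # the third window (minutes 4 to 6) has no samples
print(filled)
```

In this toy example the window covering minutes 4 to 6 contains no readings, so it is missing after resampling and is then filled from its two neighbouring windows. The same resampled series could instead be passed to a cubic spline or a Gaussian Process model to compare imputation methods on equal footing.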