Medical multivariate time series imputation and forecasting based on a recurrent conditional Wasserstein GAN and attention.

Authors

Festag Sven, Spreckelsen Cord

Affiliations

Institute of Medical Statistics, Computer and Data Sciences, Jena University Hospital, Germany; SMITH consortium of the German Medical Informatics Initiative, Germany.

Publication information

J Biomed Inform. 2023 Mar;139:104320. doi: 10.1016/j.jbi.2023.104320. Epub 2023 Feb 13.

Abstract

OBJECTIVE

In medical care and research, as well as in hospital management, time series are an important part of the available data. To ensure high quality standards and support sound decisions, precise and generic imputation and forecasting tools that account for temporal dynamics are of great importance. Because forecasting and imputation involve inherent uncertainty, our work focused on a probabilistic multivariate generative approach that samples infillings or forecasts from an analysable distribution rather than producing deterministic results.

MATERIALS AND METHODS

For this task, we developed a system based on generative adversarial networks that consist of recurrent encoders and decoders with attention mechanisms. The system can learn the distribution of intervals from multivariate time series, conditioned on the periods before and, if available, the periods after the values that are to be predicted. For training, validation and testing, a data set of jointly measured blood pressure series (ABP) and electrocardiograms (ECG) (length: 1,250 time steps, corresponding to 10 s) was generated. For the imputation tasks, one interval of fixed length was masked randomly and independently in both channels of every sample. For the forecasting task, all masks were positioned at the end.
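The two masking schemes described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the mask length of 100 steps are hypothetical, since the abstract only states that the interval length is fixed.

```python
import random


def random_interval_mask(series_len, mask_len, n_channels, rng):
    """Imputation-style mask: one contiguous interval of fixed length per
    channel, with the start position drawn independently for each channel.
    Returns one list of booleans per channel (True = masked)."""
    mask = []
    for _ in range(n_channels):
        # randint is inclusive on both ends, so the interval always fits.
        start = rng.randint(0, series_len - mask_len)
        mask.append([start <= t < start + mask_len for t in range(series_len)])
    return mask


def forecast_mask(series_len, mask_len, n_channels):
    """Forecasting-style mask: the last mask_len steps of every channel."""
    row = [t >= series_len - mask_len for t in range(series_len)]
    return [list(row) for _ in range(n_channels)]


# Assumed values: 1,250 steps and 2 channels (ABP, ECG) come from the
# abstract; mask_len=100 is a made-up example value.
rng = random.Random(0)
imputation = random_interval_mask(series_len=1250, mask_len=100, n_channels=2, rng=rng)
forecast = forecast_mask(series_len=1250, mask_len=100, n_channels=2)
```

Drawing the start position separately per channel is what makes the imputation masks independent across ABP and ECG, whereas the forecasting mask is identical in both channels.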

RESULTS

The models were trained on around 65,000 bivariate samples and tested against 14,000 series of different persons. For the evaluation, 50 samples were produced for every masked interval to estimate the range of the generated infillings or forecasts. The element-wise arithmetic average of these samples served as an estimator for the mean of the learned conditional distribution. The approach showed better results than a state-of-the-art probabilistic multivariate forecasting mechanism based on Gaussian copula transformation and recurrent neural networks. On the imputation task, the proposed method reached a mean squared error (MSE) of 0.057 on the ECG channel and an MSE of 28.30 on the ABP channel, while the baseline approach reached MSEs of 0.095 (ECG) and 229.1 (ABP). Moreover, on the forecasting task, the presented system achieved MSEs of 0.069 (ECG) and 33.73 (ABP), outperforming the recurrent copula approach, which reached MSEs of 0.082 (ECG) and 196.53 (ABP).
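The evaluation procedure described above (several generated samples per masked interval, their element-wise mean as a point estimate, and the MSE against the true values) can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and the "generator" is replaced by ground truth plus Gaussian noise purely so that the example runs on its own.

```python
import random


def elementwise_mean(samples):
    """Average equally long samples position by position."""
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]


def sample_range(samples):
    """Per-position (min, max) over all samples, a simple spread estimate."""
    return [(min(values), max(values)) for values in zip(*samples)]


def mse(prediction, truth):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(prediction, truth)) / len(truth)


# Stand-in for a trained generator (assumption): truth plus small noise.
rng = random.Random(0)
truth = [0.0, 0.5, 1.0, 0.5, 0.0]
samples = [[t + rng.gauss(0.0, 0.1) for t in truth] for _ in range(50)]

mean_prediction = elementwise_mean(samples)  # estimator of the conditional mean
ranges = sample_range(samples)               # range of the generated values
error = mse(mean_prediction, truth)
```

Averaging before computing the MSE scores the estimated conditional mean, while the per-position range conveys how much the individual samples disagree.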

CONCLUSION

The presented generative probabilistic system for the imputation and forecasting of (medical) time series features the flexibility to handle masks of different sizes and positions, the ability to quantify uncertainty due to its probabilistic predictions, and an adjustable trade-off between the goals of minimising errors in individual predictions and minimising the distance between the learned and the real conditional distribution of the infillings or forecasts.
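The abstract does not state how the trade-off is implemented. One common way to obtain such an adjustable balance, given here only as an assumed illustration and not as the authors' objective, is to add a weighted reconstruction term to a conditional Wasserstein generator loss. All symbols are assumptions: $G$ is the generator, $D$ the critic, $c$ the observed context, $z$ the noise input, $x$ the true masked interval and $\lambda \ge 0$ the weight.

```latex
\mathcal{L}_G
  = \underbrace{-\,\mathbb{E}_{z}\!\left[ D\big(G(z, c),\, c\big) \right]}_{\text{distance to the real conditional distribution}}
  \;+\; \lambda \,
    \underbrace{\mathbb{E}_{z}\!\left[ \big\lVert G(z, c) - x \big\rVert_2^2 \right]}_{\text{error of individual predictions}}
```

Under this assumed form, a larger $\lambda$ favours low error in each individual prediction, while a smaller $\lambda$ favours matching the conditional distribution as a whole.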
