School of Information Science and Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China.
Shanghai Hospital Development Center, 2 Kangding Road, Shanghai, 200000, China.
BMC Med Inform Decis Mak. 2019 Dec 17;19(Suppl 8):259. doi: 10.1186/s12911-019-0985-7.
Electronic health records (EHRs) offer opportunities to improve patient care and facilitate clinical research. However, applications of EHRs face many challenges, such as temporality, high dimensionality, sparseness, noise, random error and systematic bias. In particular, traditional machine learning methods struggle to use temporal information effectively, even though the sequential information in EHRs is very useful.
In this paper, we propose a general-purpose patient representation learning approach to summarize sequential EHRs. Specifically, a recurrent neural network based denoising autoencoder (RNN-DAE) is employed to encode the in-hospital records of each patient into a low-dimensional dense vector.
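To make the idea concrete, the following is a minimal NumPy sketch of a recurrent denoising autoencoder, not the authors' implementation. The abstract does not specify the recurrent cell, the noise model, the loss, or any layer sizes, so the plain tanh RNN, masking noise, mean squared error, and all names and dimensions below are assumptions chosen for illustration. The record is synthetic, and only the forward pass is shown; training would minimize the reconstruction loss over many patients.

```python
import numpy as np

def init_params(n_features, n_hidden, rng):
    """Small random weights for a tanh RNN encoder and decoder (illustrative only)."""
    s = 0.1
    return {
        "W_xh": rng.normal(0, s, (n_features, n_hidden)),  # encoder input -> hidden
        "W_hh": rng.normal(0, s, (n_hidden, n_hidden)),    # encoder hidden -> hidden
        "b_h": np.zeros(n_hidden),
        "U_hh": rng.normal(0, s, (n_hidden, n_hidden)),    # decoder hidden -> hidden
        "c_h": np.zeros(n_hidden),
        "V_hy": rng.normal(0, s, (n_hidden, n_features)),  # decoder hidden -> output
        "c_y": np.zeros(n_features),
    }

def corrupt(x, drop_prob, rng):
    """Masking noise (assumed): zero each entry independently with probability drop_prob."""
    keep = rng.random(x.shape) >= drop_prob
    return x * keep

def encode(x_seq, p):
    """Run the encoder over the time steps; the final hidden state is the patient vector."""
    h = np.zeros(p["b_h"].shape)
    for x_t in x_seq:
        h = np.tanh(x_t @ p["W_xh"] + h @ p["W_hh"] + p["b_h"])
    return h

def decode(z, n_steps, p):
    """Unroll the decoder from the patient vector, emitting one reconstruction per step."""
    h, outputs = z, []
    for _ in range(n_steps):
        h = np.tanh(h @ p["U_hh"] + p["c_h"])
        outputs.append(h @ p["V_hy"] + p["c_y"])
    return np.stack(outputs)

def reconstruction_loss(x_seq, p, drop_prob, rng):
    """Denoising objective: encode the corrupted record, score against the clean one."""
    z = encode(corrupt(x_seq, drop_prob, rng), p)
    x_hat = decode(z, len(x_seq), p)
    return float(np.mean((x_hat - x_seq) ** 2))

rng = np.random.default_rng(0)
record = rng.random((5, 8))            # synthetic record: 5 time steps x 8 features
params = init_params(8, 3, rng)        # 3-dimensional patient vector (arbitrary choice)
patient_vector = encode(record, params)
loss = reconstruction_loss(record, params, 0.2, rng)
```

Because the decoder sees only the final hidden state, the encoder has to compress the whole sequence into that one vector, and corrupting the input before encoding is what makes the autoencoder "denoising". After training, `encode` alone would be applied to each patient's record to obtain the dense representation used by downstream tasks.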
Using EHR data collected from Shuguang Hospital, affiliated to Shanghai University of Traditional Chinese Medicine, we experimentally evaluate the proposed RNN-DAE method on both a mortality prediction task and a comorbidity prediction task. Extensive experiments show that the RNN-DAE method outperforms existing methods. In addition, we use the "Deep Feature" representation learned by RNN-DAE to track similar patients with t-SNE, which also yields some interesting observations.
We propose an effective unsupervised RNN-DAE method to summarize patient sequential information in EHR data. The method is useful for both mortality prediction and comorbidity prediction.