School of Biomedical Informatics, University of Texas Health Science Center at Houston (UTHealth), Houston, TX, United States.
Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States.
J Biomed Inform. 2018 Aug;84:11-16. doi: 10.1016/j.jbi.2018.06.011. Epub 2018 Jun 15.
Recently, recurrent neural networks (RNNs) have been applied to predicting disease onset risk from electronic health record (EHR) data. While these models have shown promising results on relatively small data sets, their generalizability and transferability, and their applicability to different patient populations across hospitals, have not been evaluated. In this study, we evaluated an RNN model, RETAIN, on Cerner Health Facts® EMR data for heart failure onset risk prediction. Our data set included over 150,000 heart failure patients and over 1,000,000 controls from nearly 400 hospitals. RETAIN achieved an AUC of 82%, compared with 79% for logistic regression, demonstrating the power of more expressive deep learning models for EHR predictive modeling. Prediction performance fluctuated across patient groups and varied from hospital to hospital. We also trained RETAIN models on individual hospitals and found that these models can be applied to other hospitals with a reduction in AUC of only about 3.6%. Our results demonstrate the capability of RNNs for predictive modeling with large and heterogeneous EHR data and pave the way for future improvements.
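The two quantities the abstract reports, AUC and the loss of AUC when a model is moved to another hospital, can be illustrated with a minimal sketch. This is not the study's evaluation code: the function names and the toy labels and scores below are hypothetical, and the abstract does not say whether the "about 3.6%" reduction is absolute or relative, so both are computed.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    case/control pairs in which the case receives the higher score
    (the Mann-Whitney U formulation). Ties count as half a pair."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    correct = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return correct / (len(cases) * len(controls))


def auc_reduction(auc_source, auc_target):
    """Return (absolute, relative) AUC reduction when a model trained at a
    source hospital is evaluated at a target hospital. Hypothetical helper:
    the abstract does not state which of the two definitions it uses."""
    absolute = auc_source - auc_target
    return absolute, absolute / auc_source


# Synthetic toy data, not taken from the study: 1 = case, 0 = control.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)  # 5 of 6 case/control pairs ordered correctly
```

The pairwise form is quadratic in the number of patients, so it only suits small examples. For a cohort of the size described here, a sort-based implementation such as scikit-learn's `roc_auc_score` would be the practical choice.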