A study of generalizability of recurrent neural network-based predictive models for heart failure onset risk using a large and heterogeneous EHR data set.

Affiliations

School of Biomedical Informatics, University of Texas Health Science Center at Houston (UTHealth), Houston, TX, United States.

Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States.

Publication information

J Biomed Inform. 2018 Aug;84:11-16. doi: 10.1016/j.jbi.2018.06.011. Epub 2018 Jun 15.

DOI:10.1016/j.jbi.2018.06.011

PMID:29908902

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC6076336/

Abstract

Recently, recurrent neural networks (RNNs) have been applied to predicting disease onset risk from Electronic Health Record (EHR) data. While these models demonstrated promising results on relatively small data sets, their generalizability and transferability, and their applicability to different patient populations across hospitals, have not been evaluated. In this study, we evaluated an RNN model, RETAIN, on Cerner Health Facts® EMR data for heart failure onset risk prediction. Our data set included over 150,000 heart failure patients and over 1,000,000 controls from nearly 400 hospitals. RETAIN achieved an AUC of 82%, compared with an AUC of 79% for logistic regression, demonstrating the power of more expressive deep learning models for EHR predictive modeling. Prediction performance fluctuated across patient groups and varied from hospital to hospital. We also trained RETAIN models on individual hospitals and found that a model trained on one hospital can be applied to other hospitals with a reduction in AUC of only about 3.6%. Our results demonstrate the capability of RNNs for predictive modeling with large and heterogeneous EHR data, and pave the way for future improvements.
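As an illustration of the two quantities the abstract reports (AUC and the drop in AUC when a model is moved to another hospital), here is a minimal Python sketch. It is not the authors' code: the function names `auc` and `relative_auc_drop` are hypothetical, the scores are made-up toy values, and the abstract does not say whether the 3.6% figure is a relative or an absolute reduction, so the relative form used here is an assumption.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case
    receives the higher risk score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def relative_auc_drop(auc_source: float, auc_target: float) -> float:
    """Relative reduction in AUC from the training hospital to another one.
    (Assumption: the abstract does not specify relative vs. absolute.)"""
    return (auc_source - auc_target) / auc_source


# Toy data only, not taken from the study.
# Hospital A: the hospital the hypothetical model was trained on.
labels_a = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
# Hospital B: a different hospital scored with the same model.
labels_b = [1, 1, 1, 0, 0, 0]
scores_b = [0.9, 0.6, 0.4, 0.7, 0.5, 0.1]

auc_a = auc(labels_a, scores_a)
auc_b = auc(labels_b, scores_b)
drop = relative_auc_drop(auc_a, auc_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, relative drop = {drop:.1%}")
```

The pairwise definition is quadratic in the number of patients, which is fine for a toy example; at the scale described in the abstract, a rank-based implementation from a standard library would be the more practical choice.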
