Goldstein Benjamin A, Pomann Gina Maria, Winkelmayer Wolfgang C, Pencina Michael J
Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, Durham, 27705, NC, U.S.A.
Center for Predictive Medicine, Duke Clinical Research Institute, Durham, NC, 27705, U.S.A.
Stat Med. 2017 Jul 30;36(17):2750-2763. doi: 10.1002/sim.7308. Epub 2017 May 2.
An increasingly important data source for the development of clinical risk prediction models is electronic health records (EHRs). One of their key advantages is that they contain data on many individuals collected over time. This allows one to incorporate more clinical information into a risk model. However, traditional methods for developing risk models are not well suited to these irregularly collected clinical covariates. In this paper, we compare a range of approaches for using longitudinal predictors in a clinical risk model. Using data from an EHR for patients undergoing hemodialysis, we incorporate five different clinical predictors into a risk model for patient mortality. We consider different approaches for treating the repeated measurements, including the use of summary statistics, machine learning methods, functional data analysis, and joint models. We follow up our empirical findings with a simulation study. Overall, our results suggest that simple approaches perform just as well as, if not better than, more complex analytic approaches. These results have important implications for the development of risk prediction models with EHRs. Copyright © 2017 John Wiley & Sons, Ltd.
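As an illustration of the simplest approach named in the abstract, summary statistics, the following minimal Python sketch collapses irregularly timed repeated measurements into one row of fixed-length features per patient. It is not the authors' implementation: the function name `summarize_measurements`, the record layout (patient id, day, value), the chosen statistics, and the toy values are all assumptions made for the example.

```python
from statistics import mean


def summarize_measurements(records):
    """Collapse repeated measurements into per-patient summary features.

    `records` is a list of (patient_id, day, value) tuples. Measurement
    days may be irregular and may differ between patients (hypothetical
    layout, assumed for this sketch).
    """
    by_patient = {}
    for patient_id, day, value in records:
        by_patient.setdefault(patient_id, []).append((day, value))

    summaries = {}
    for patient_id, observations in by_patient.items():
        observations.sort()  # order by day so "last" is the most recent value
        values = [value for _, value in observations]
        summaries[patient_id] = {
            "n_obs": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            "last": values[-1],
        }
    return summaries


# Hypothetical lab values recorded on irregular days for two patients.
records = [("A", 0, 3.9), ("A", 14, 3.5), ("A", 20, 3.1), ("B", 5, 4.2)]
features = summarize_measurements(records)
```

Each patient ends up with the same set of features regardless of how many measurements were recorded, so the result could be passed to a standard regression or survival model as ordinary baseline covariates.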