Sun Yifei, McCulloch Charles E, Marr Kieren A, Huang Chiung-Yu
Department of Biostatistics, Columbia University Mailman School of Public Health, New York, NY 10032.
Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158.
J Am Stat Assoc. 2021;116(534):594-604. doi: 10.1080/01621459.2020.1801447. Epub 2020 Aug 26.
Although increasingly used as a data resource for assembling cohorts, electronic health records (EHRs) pose many analytic challenges. In particular, a patient's health status influences when and what data are recorded, generating sampling bias in the collected data. In this paper, we consider recurrent event analysis using EHR data. Conventional regression methods for event risk analysis usually require the values of covariates to be observed throughout the follow-up period. In EHR databases, time-dependent covariates are intermittently measured during clinical visits, and the timing of these visits is informative in the sense that it depends on the disease course. Simple methods, such as the last-observation-carried-forward approach, can lead to biased estimation. On the other hand, complex joint models require additional assumptions on the covariate process and cannot be easily extended to handle multiple longitudinal predictors. By incorporating sampling weights derived from estimating the observation time process, we develop a novel estimation procedure based on inverse-rate-weighting and kernel-smoothing for the semiparametric proportional rate model of recurrent events. The proposed methods do not require model specifications for the covariate processes and can easily handle multiple time-dependent covariates. Our methods are applied to a kidney transplant study for illustration.
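As a rough orientation, the proportional rate model named in the abstract is commonly written in the generic form below, followed by a schematic version of the weighting idea. The notation is an assumption made here for illustration: N(t) is the recurrent event counting process, Z(t) the time-dependent covariates, \mu_0 an unspecified baseline mean function, \rho(s) the rate of clinical visits at time s, and K_h a kernel with bandwidth h. The paper's own formulation and estimating equations may differ.

```latex
% Semiparametric proportional rate model (generic form; notation assumed here)
\[
  E\{\mathrm{d}N(t) \mid Z(t)\} = \exp\{\beta^{\top} Z(t)\}\,\mathrm{d}\mu_0(t)
\]
% Schematic weight for a covariate measurement taken at visit time s and
% used at event time t: kernel smoothing in time, divided by the visit rate.
\[
  w(s; t) = \frac{K_h(t - s)}{\rho(s)},
  \qquad
  K_h(u) = \frac{1}{h}\, K\!\left(\frac{u}{h}\right)
\]
```

Read this way, measurements taken during periods of frequent visits are down-weighted, and only measurements close to the event time contribute, which is the intuition behind combining inverse-rate-weighting with kernel-smoothing.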