Atem Folefac D, Matsouaka Roland A, Zimmern Vincent E
Department of Biostatistics and Data Science, University of Texas Health Science Center at Houston, Houston, TX, USA.
Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA.
Biom J. 2019 Jul;61(4):1020-1032. doi: 10.1002/bimj.201800275. Epub 2019 Mar 25.
This paper considers a Cox proportional hazards regression model in which some covariates of interest are randomly right-censored. While methods for censored outcomes are ubiquitous in the literature, methods for censored covariates have received little attention and have mostly addressed the limit-of-detection problem. For randomly censored covariates, a commonly used method is complete-case analysis (CCA), which deletes censored observations from the analysis and is therefore inefficient. When censoring is not completely independent, CCA also leads to biased and spurious results. Methods for missing covariate data, including those for type I and type II covariate censoring and for limit of detection, do not readily apply because randomly censored covariates are fundamentally different in nature. We develop a novel method for censored covariates that uses conditional mean imputation, based on either Kaplan-Meier estimates or a Cox proportional hazards model, to estimate the effects of these covariates on a time-to-event outcome. We evaluate the performance of the proposed method through simulation studies and show that it provides good bias reduction and statistical efficiency. Finally, we illustrate the method using data from the Framingham Heart Study to assess the relationship between offspring and parental age of onset of cardiovascular events.
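To make the idea of Kaplan-Meier-based conditional mean imputation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the textbook form of the imputation, in which a covariate value censored at c is replaced by c plus the integral of S(t) from c upward, divided by S(c), where S is the Kaplan-Meier estimate of the covariate's survival function. The integral is truncated at the largest recorded value. All function names are hypothetical, and the sketch ignores the outcome and any other covariates, which the abstract's Cox-based variant would presumably condition on.

```python
def kaplan_meier(values, observed):
    """Kaplan-Meier estimate for a right-censored covariate.

    Returns (times, surv), where surv[k] estimates P(X > t)
    for times[k] <= t < times[k + 1].
    """
    event_times = sorted({v for v, d in zip(values, observed) if d})
    surv, s = [], 1.0
    for t in event_times:
        at_risk = sum(1 for v in values if v >= t)
        events = sum(1 for v, d in zip(values, observed) if d and v == t)
        s *= 1.0 - events / at_risk
        surv.append(s)
    return event_times, surv


def survival_at(t, times, surv):
    """Evaluate the KM step function at t (1.0 before the first event)."""
    s = 1.0
    for tk, sk in zip(times, surv):
        if tk > t:
            break
        s = sk
    return s


def integral_of_survival(lower, upper, times, surv):
    """Integrate the KM step function exactly over [lower, upper]."""
    knots = [lower] + [t for t in times if lower < t < upper] + [upper]
    return sum(
        survival_at(a, times, surv) * (b - a)
        for a, b in zip(knots[:-1], knots[1:])
    )


def impute_conditional_mean(values, observed):
    """Replace each censored value c by an estimate of E[X | X > c].

    Assumption: the integral is truncated at the largest recorded value,
    so values censored at or beyond that point are left unchanged.
    """
    times, surv = kaplan_meier(values, observed)
    upper = max(values)
    imputed = []
    for v, d in zip(values, observed):
        s_v = survival_at(v, times, surv)
        if d or v >= upper or s_v <= 0.0:
            imputed.append(float(v))
        else:
            tail = integral_of_survival(v, upper, times, surv)
            imputed.append(v + tail / s_v)
    return imputed


# Hypothetical toy data: the second covariate value is censored at 3.
values = [2, 3, 4, 5, 7]
observed = [True, False, True, True, True]
imputed = impute_conditional_mean(values, observed)
```

In this toy data set the value censored at 3 is imputed as the Kaplan-Meier-weighted mean of the larger observed values, (4 + 5 + 7) / 3 ≈ 5.33, and the uncensored values are left unchanged. The imputed covariate could then be passed to a standard Cox model fit for the time-to-event outcome.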