Department of Epidemiology and Preventive Medicine, Monash University, 99 Commercial Rd, Melbourne, VIC, Australia.
Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, 50 Flemington Rd, Parkville, VIC, Australia.
Biostatistics. 2018 Oct 1;19(4):479-496. doi: 10.1093/biostatistics/kxx046.
Modern epidemiological studies collect data on time-varying individual-specific characteristics, such as body mass index and blood pressure. Incorporation of such time-dependent covariates in time-to-event models is of great interest, but raises some challenges. Of specific concern are measurement error, and the non-synchronous updating of covariates across individuals, due for example to missing data. It is well known that in the presence of either of these issues the last observation carried forward (LOCF) approach traditionally used leads to bias. Joint models of longitudinal and time-to-event outcomes, developed recently, address these complexities by specifying a model for the joint distribution of all processes and are commonly fitted by maximum likelihood or Bayesian approaches. However, the adequate specification of the full joint distribution can be a challenging modeling task, especially with multiple longitudinal markers. In fact, most available software packages are unable to handle more than one marker and offer a restricted choice of survival models. We propose a two-stage approach, Multiple Imputation for Joint Modeling (MIJM), to incorporate multiple time-dependent continuous covariates in the semi-parametric Cox and additive hazard models. Assuming a primary focus on the time-to-event model, the MIJM approach handles the joint distribution of the markers using multiple imputation by chained equations, a computationally convenient procedure that is widely available in mainstream statistical software. We developed an R package "survtd" that allows MIJM and other approaches in this manuscript to be applied easily, with just one call to its main function. A simulation study showed that MIJM performs well across a wide range of scenarios in terms of bias and coverage probability, particularly compared with LOCF, simpler two-stage approaches, and a Bayesian joint model. The Framingham Heart Study is used to illustrate the approach.
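As a minimal sketch of the LOCF approach that the abstract contrasts MIJM against, the following Python snippet carries each individual's most recent marker measurement forward to an event time. It is not taken from the "survtd" package; the function name `locf`, the individuals "A" and "B", and all visit times and BMI values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def locf(measure_times, values, query_time):
    """Return the last value measured at or before query_time.

    measure_times must be sorted in increasing order and aligned with values.
    Returns None if no measurement exists at or before query_time.
    """
    idx = bisect_right(measure_times, query_time)
    return values[idx - 1] if idx > 0 else None


def last_measure_time(measure_times, query_time):
    """Return the time of the measurement that LOCF would carry forward."""
    idx = bisect_right(measure_times, query_time)
    return measure_times[idx - 1] if idx > 0 else None


# Hypothetical BMI visits: individual B missed the visits at times 2.0 and 4.0,
# so the two individuals' covariates are updated non-synchronously.
bmi_visits = {
    "A": ([0.0, 2.0, 4.0], [24.1, 25.0, 26.3]),
    "B": ([0.0, 3.5], [30.2, 29.4]),
}

event_time = 4.5  # hypothetical event time at which a risk set is evaluated

for pid, (times, values) in bmi_visits.items():
    carried = locf(times, values, event_time)
    age = event_time - last_measure_time(times, event_time)
    print(f"{pid}: carried-forward BMI = {carried}, measured {age} time units ago")
```

The carried-forward values differ in how stale they are across individuals, and each one still contains whatever measurement error the original reading had. These are the two issues the abstract names as sources of bias under LOCF.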