Quantitative Sciences Unit, Department of Medicine, Stanford University, Palo Alto, California.
Division of Nephrology, Department of Medicine, Stanford University, Palo Alto, California.
Stat Med. 2019 Jul 30;38(17):3204-3220. doi: 10.1002/sim.8174. Epub 2019 May 17.
The treatment of missing data in comparative effectiveness studies with right-censored outcomes and time-varying covariates is challenging because of the multilevel structure of the data. In particular, the performance of an accessible method like multiple imputation (MI) under an imputation model that ignores the multilevel structure is unknown and has not been compared to the complete-case (CC) and single imputation methods that are most commonly applied in this context. Through an extensive simulation study, we compared statistical properties among CC analysis, last value carried forward, mean imputation, the use of missing indicators, and MI-based approaches with and without auxiliary variables under an extended Cox model, when the interest lies in characterizing relationships between non-missing time-varying exposures and right-censored outcomes. Under a moderate missing-at-random condition, MI demonstrated favorable properties (absolute bias <0.1) and outperformed CC and single imputation methods, even when the MI method did not account for correlated observations in the imputation model. The performance of MI decreased with increasing complexity, such as when the missing data mechanism involved the exposure of interest, but MI was still preferred over the other methods considered and performed well in the presence of strong auxiliary variables. When data are missing in a time-varying confounder, we recommend considering MI that ignores the multilevel structure in the imputation model, incorporating variables associated with missingness in the MI models, and conducting sensitivity analyses across plausible assumptions.
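As a rough illustration of the comparator methods named in the abstract, the sketch below applies CC analysis, last value carried forward, mean imputation, and a missing indicator to a toy long-format data set (one row per subject per interval). The data frame, its column names (`id`, `start`, `stop`, `conf`), and the helper `pool_rubin` are hypothetical and are not taken from the study. The pooling step uses the standard Rubin's rules for combining estimates across imputed data sets; the abstract does not state how the MI estimates were combined, so that part is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: one row per subject per time interval,
# with a time-varying confounder `conf` that is sometimes missing.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2],
    "start": [0, 1, 2, 0, 1, 2],
    "stop":  [1, 2, 3, 1, 2, 3],
    "conf":  [1.0, np.nan, 3.0, np.nan, 4.0, np.nan],
})

# Complete-case (CC) analysis: drop intervals with a missing confounder.
cc = df.dropna(subset=["conf"])

# Last value carried forward: fill within each subject only, so a value
# is never carried across subjects; leading gaps stay missing.
locf = df.copy()
locf["conf"] = locf.groupby("id")["conf"].ffill()

# Mean imputation: replace missing values with the overall observed mean.
mean_imp = df.copy()
mean_imp["conf"] = mean_imp["conf"].fillna(df["conf"].mean())

# Missing indicator: add a 0/1 flag and fill the gap with a constant.
ind = df.copy()
ind["conf_missing"] = ind["conf"].isna().astype(int)
ind["conf"] = ind["conf"].fillna(0.0)


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules (assumed here).

    Returns the pooled estimate and its standard error.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()              # pooled point estimate
    within = u.mean()             # average within-imputation variance
    between = q.var(ddof=1)       # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, np.sqrt(total)


# Illustrative numbers only: log hazard ratios from m = 3 imputed data sets.
est, se = pool_rubin([0.5, 0.6, 0.7], [0.04, 0.04, 0.04])
```

In a full analysis each imputed data set would be fit with an extended Cox model before pooling; that step is omitted here to keep the sketch self-contained.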