J Biopharm Stat. 2022 Sep 3;32(5):717-739. doi: 10.1080/10543406.2021.2011898. Epub 2022 Jan 18.
The literature on dealing with missing covariates in nonrandomized studies advocates the use of sophisticated methods like multiple imputation (MI) and maximum likelihood (ML)-based approaches over simple methods. However, these methods are not necessarily optimal in terms of bias and efficiency of treatment effect estimation in randomized studies, where the covariate of interest (treatment group) is independent of all baseline (pre-randomization) covariates due to randomization. This has been shown in the literature, but only for missingness on a single baseline covariate. Here, we extend the situation to multiple baseline covariates with missingness and evaluate the performance of MI and ML compared with simple alternative methods under various missingness scenarios in randomized controlled trials (RCTs) with a quantitative outcome. We first derive asymptotic relative efficiencies of the simple methods under the missing completely at random (MCAR) scenario and then perform a simulation study for non-MCAR scenarios. Finally, a trial on chronic low back pain is used to illustrate the implementation of the methods. The results show that all simple methods give unbiased treatment effect estimation but with increased mean squared residual. Mean imputation and the missing-indicator method are the most efficient under all covariate missingness scenarios and perform at least as well as MI and ML in each scenario.
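The two simple methods named above can be illustrated with a minimal sketch. It is not the authors' code: the simulated trial, the single covariate, the effect sizes, the 30% MCAR rate, and all function names (`mean_impute`, `missing_indicator`, `ols`, `simulate_trial`) are assumptions made for illustration. Mean imputation replaces each missing covariate value with the observed mean; the missing-indicator method fills missing values with a constant and adds a 0/1 missingness indicator as an extra regressor.

```python
import random

def mean_impute(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def missing_indicator(values, fill=0.0):
    """Return (filled covariate, indicator); indicator is 1 where the value was missing."""
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator

def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    rows = [[1.0] + [float(c[i]) for c in columns] for i in range(len(y))]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * z for x, z in zip(a[r], a[col])]
    return [a[j][p] / a[j][j] for j in range(p)]

def simulate_trial(n=2000, effect=1.0, p_missing=0.3, seed=1):
    """Hypothetical two-arm RCT with one baseline covariate missing completely at random."""
    rng = random.Random(seed)
    treat = [i % 2 for i in range(n)]
    rng.shuffle(treat)  # 1:1 randomization, independent of the covariate
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [effect * t + 0.8 * xi + rng.gauss(0, 1) for t, xi in zip(treat, x)]
    x_obs = [None if rng.random() < p_missing else xi for xi in x]
    return treat, x_obs, y

treat, x_obs, y = simulate_trial()

# Mean imputation: regress the outcome on treatment and the imputed covariate.
effect_mean = ols([treat, mean_impute(x_obs)], y)[1]

# Missing-indicator method: filled covariate plus the missingness indicator.
x_filled, r = missing_indicator(x_obs)
effect_indicator = ols([treat, x_filled, r], y)[1]

print("mean imputation:", round(effect_mean, 3))
print("missing indicator:", round(effect_indicator, 3))
```

Because treatment is randomized independently of the covariate, both estimates should land close to the simulated effect of 1.0 in this setup. The sketch covers only the single-covariate MCAR case and does not reproduce the paper's efficiency comparisons with MI and ML.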