Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA.
Department of Obstetrics and Gynecology, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA.
Stat Med. 2023 Oct 15;42(23):4282-4298. doi: 10.1002/sim.9860. Epub 2023 Jul 31.
Inverse probability weighting can be used to correct for missing data. New estimators for the weights in the nonmonotone setting were introduced in 2018. These estimators are the unconstrained maximum likelihood estimator (UMLE) and the constrained Bayesian estimator (CBE), an alternative if UMLE fails to converge. In this work we describe and illustrate these estimators, and examine their performance in simulation and in an applied example estimating the effect of anemia on spontaneous preterm birth in the Zambia Preterm Birth Prevention Study. We compare performance with multiple imputation (MI) and focus on the setting of an observational study where inverse probability of treatment weights are used to address confounding. In simulation, weighting was less statistically efficient at the smallest sample size and lowest exposure prevalence examined (n = 1500 and 15%, respectively), but in other scenarios the statistical performance of weighting and MI was similar. Weighting was more computationally efficient, taking on average 0.4 and 0.05 times as long as MI in R and SAS, respectively. UMLE was easy to implement in commonly used software, and convergence failure occurred just twice in more than 200 000 simulated cohorts, making implementation of CBE unnecessary. In conclusion, weighting is an alternative to MI for nonmonotone missingness, though MI performed as well as or better in terms of bias and statistical efficiency. Weighting may be preferred for its superior computational efficiency with large sample sizes or when using resampling algorithms. As the validity of weighting and MI relies on correct specification of different models, both approaches could be implemented to check agreement of results.
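To make the general idea concrete, the following is a minimal Python sketch of how inverse probability of treatment weights and inverse probability of missingness weights can be multiplied together in a complete-case analysis. It is not the UMLE or CBE described above: it assumes a single missingness pattern (only the outcome is missing) rather than nonmonotone missingness, and it estimates both weight models nonparametrically within strata of one binary confounder. All variable names, function names, and simulation parameters are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def simulate(n, seed=0):
    """Simulate rows (z, a, y, r): confounder, exposure, outcome, observed flag.

    Hypothetical data-generating values; the true risk difference is 0.10.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        # Exposure depends on the confounder z.
        a = 1 if rng.random() < (0.3 if z else 0.1) else 0
        # Outcome depends on exposure and confounder (additive risks).
        y = 1 if rng.random() < 0.10 + 0.10 * a + 0.15 * z else 0
        # Outcome is observed with a probability that depends on z and a only.
        r = 1 if rng.random() < 0.9 - 0.3 * z - 0.2 * a else 0
        rows.append((z, a, y if r else None, r))
    return rows


def stratum_mean(rows, key, value):
    """Mean of value(row) within each stratum defined by key(row)."""
    total, count = defaultdict(float), defaultdict(int)
    for row in rows:
        total[key(row)] += value(row)
        count[key(row)] += 1
    return {k: total[k] / count[k] for k in count}


def weighted_risk_difference(rows):
    """Complete-case risk difference using treatment x missingness weights."""
    # Treatment model P(A = 1 | Z), estimated in the full cohort.
    p_a = stratum_mean(rows, key=lambda t: t[0], value=lambda t: t[1])
    # Missingness model P(R = 1 | Z, A), estimated in the full cohort.
    p_r = stratum_mean(rows, key=lambda t: (t[0], t[1]), value=lambda t: t[3])
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for z, a, y, r in rows:
        if not r:
            continue  # only complete cases contribute
        p_treat = p_a[z] if a == 1 else 1.0 - p_a[z]
        w = 1.0 / (p_treat * p_r[(z, a)])  # product of the two weights
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]


def naive_risk_difference(rows):
    """Unweighted complete-case risk difference, for comparison."""
    obs = [(a, y) for _, a, y, r in rows if r]

    def risk(arm):
        ys = [y for a, y in obs if a == arm]
        return sum(ys) / len(ys)

    return risk(1) - risk(0)


if __name__ == "__main__":
    cohort = simulate(200_000, seed=1)
    print("weighted:", round(weighted_risk_difference(cohort), 3))
    print("naive:   ", round(naive_risk_difference(cohort), 3))
```

In this simplified setting the weighted estimate should land close to the simulated risk difference of 0.10, while the unweighted complete-case estimate remains confounded. The nonmonotone case studied in the paper requires estimating probabilities for several missingness patterns jointly, which this sketch does not attempt.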