1 MRC Biostatistics Unit, Institute of Public Health, Cambridge, UK.
2 School of Social and Community Medicine, University of Bristol, Bristol, UK.
Stat Methods Med Res. 2018 Jun;27(6):1603-1614. doi: 10.1177/0962280216665872. Epub 2016 Sep 5.
Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with the joint model. We show that this asymptotic equivalence of imputation distributions does not imply that the two approaches yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they are equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, they are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, when there is substantial missingness in the outcome variable, full-conditional specification multiple imputation is shown to be potentially much more robust to misspecification of the restricted general location model than joint model multiple imputation using that model.
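As an illustration of the full-conditional specification idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes only two continuous variables, each imputed from a linear regression on the other, whereas the abstract also covers logistic and multinomial conditional models. The function names (`fit_line`, `impute_target`, `fcs_impute`) and the simulated data are hypothetical. For brevity the sketch adds residual noise around the fitted values but does not draw the regression parameters from their posterior, which proper multiple imputation would require.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of ys = a + b * xs; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / (len(xs) - 2)) ** 0.5


def impute_target(target, predictor, missing, rng):
    """Redraw the missing entries of `target` from a linear model on `predictor`."""
    # Fit the conditional model on rows where the target was actually observed.
    rows = [i for i, m in enumerate(missing) if not m]
    a, b, sd = fit_line([predictor[i] for i in rows], [target[i] for i in rows])
    for i, m in enumerate(missing):
        if m:
            # Simplification: parameters are held at their estimates.
            target[i] = a + b * predictor[i] + rng.gauss(0.0, sd)


def fcs_impute(x, y, n_cycles=10, rng=None):
    """Return one completed copy of (x, y); None marks a missing value."""
    rng = rng or random.Random()
    miss_x = [v is None for v in x]
    miss_y = [v is None for v in y]
    # Start from observed means, then cycle through the conditional models.
    x_mean = statistics.fmean([v for v in x if v is not None])
    y_mean = statistics.fmean([v for v in y if v is not None])
    xc = [x_mean if m else v for v, m in zip(x, miss_x)]
    yc = [y_mean if m else v for v, m in zip(y, miss_y)]
    for _ in range(n_cycles):
        impute_target(xc, yc, miss_x, rng)
        impute_target(yc, xc, miss_y, rng)
    return xc, yc


# Simulated example (hypothetical data): y depends linearly on x,
# with values deleted completely at random in both variables.
rng = random.Random(2016)
n = 200
x_full = [rng.gauss(0, 1) for _ in range(n)]
y_full = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x_full]
x = [None if rng.random() < 0.2 else v for v in x_full]
y = [None if rng.random() < 0.3 else v for v in y_full]

# Create M completed datasets and fit the model of interest in each.
M = 20
slopes = []
for _ in range(M):
    xc, yc = fcs_impute(x, y, rng=rng)
    slopes.append(fit_line(xc, yc)[1])

# The pooled point estimate is the average across imputed datasets.
pooled_slope = statistics.fmean(slopes)
print(round(pooled_slope, 3))
```

A joint model approach would instead draw all missing values from a single multivariate model. The abstract's comparison concerns how the two approaches differ in efficiency and robustness, which a sketch of this size does not attempt to reproduce.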