Yuan Ke-Hai, Yang-Wallentin Fan, Bentler Peter M
University of Notre Dame.
Uppsala University, Sweden.
Sociol Methods Res. 2012 Nov;41(4):598-629. doi: 10.1177/0049124112460373.
Normal-distribution-based maximum likelihood (ML) and multiple imputation (MI) are the two major procedures for missing data analysis. This article compares the two procedures with respect to the bias and efficiency of parameter estimates. It also compares formula-based standard errors (SEs) for each procedure against the corresponding empirical SEs. The results indicate that parameter estimates by MI tend to be less efficient than those by ML, and that the estimates of variance-covariance parameters by MI are also more biased. In particular, when the population for the observed variables possesses heavy tails, estimates of variance-covariance parameters by MI may contain severe bias even at relatively large sample sizes. Although they perform much better, ML parameter estimates may also contain substantial bias at smaller sample sizes. The results also indicate that, when the underlying population is close to normally distributed, SEs based on the sandwich-type covariance matrix and those based on the observed information matrix are both very comparable to empirical SEs with either ML or MI. When the underlying distribution has heavier tails, SEs based on the sandwich-type covariance matrix for ML estimates are more reliable than those based on the observed information matrix. Both empirical results and analysis show that neither SEs based on the observed information matrix nor those based on the sandwich-type covariance matrix can provide consistent SEs in MI. Thus, ML is preferable to MI in practice, although parameter estimates by MI might still be consistent.
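The contrast between information-based and sandwich-type SEs can be illustrated with a minimal sketch. It rests on assumptions that are not part of the article: complete data (no missing values), a single variable, and the normal-theory ML estimate of the variance as the only parameter of interest. The function names and the choice of a Laplace distribution as the heavy-tailed population are hypothetical and serve illustration only. For this parameter the information-based SE is s2 * sqrt(2 / n), while the sandwich-type SE is sqrt((m4 - s2^2) / n), where m4 is the sample fourth central moment.

```python
import math
import random


def variance_mle_with_ses(x):
    """Normal-theory ML variance estimate with two formula-based SEs."""
    n = len(x)
    mean = sum(x) / n
    dev2 = [(xi - mean) ** 2 for xi in x]
    s2 = sum(dev2) / n                    # ML estimate of the variance
    m4 = sum(d * d for d in dev2) / n     # sample fourth central moment
    # SE from the information matrix (observed and expected coincide at the MLE here).
    se_info = s2 * math.sqrt(2.0 / n)
    # Sandwich-type SE, which does not assume normal kurtosis.
    se_sandwich = math.sqrt(max(m4 - s2 * s2, 0.0) / n)
    return s2, se_info, se_sandwich


def simulate(draw, n=200, reps=500, seed=0):
    """Compare average formula-based SEs with the empirical SE over replications."""
    rng = random.Random(seed)
    estimates, info, sandwich = [], [], []
    for _ in range(reps):
        s2, se_i, se_s = variance_mle_with_ses([draw(rng) for _ in range(n)])
        estimates.append(s2)
        info.append(se_i)
        sandwich.append(se_s)
    mean_est = sum(estimates) / reps
    empirical = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (reps - 1))
    return {
        "empirical": empirical,
        "info": sum(info) / reps,
        "sandwich": sum(sandwich) / reps,
    }


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_laplace(rng):
    # Difference of two unit exponentials: a Laplace variate with heavier tails than normal.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


normal = simulate(draw_normal)
heavy = simulate(draw_laplace)
print("normal:", normal)
print("heavy-tailed:", heavy)
```

In this simplified setting the two formula-based SEs should nearly agree for normal data, whereas for the heavy-tailed population the information-based SE should understate the empirical SE and the sandwich-type SE should track it more closely. The sketch says nothing about MI, where the abstract reports that neither formula yields consistent SEs.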