Bartlett Jonathan W, Hughes Rachael A
Department of Mathematical Sciences, University of Bath, Bath, UK.
Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
Stat Methods Med Res. 2020 Dec;29(12):3533-3546. doi: 10.1177/0962280220932189. Epub 2020 Jun 30.
Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin's simple combination rules. These give frequentist-valid inferences when the imputation and analysis procedures are so-called congenial and the embedding model is correctly specified, but may not otherwise. Roughly speaking, congeniality concerns whether the imputation and analysis models make compatible assumptions about the data: the two are uncongenial when their assumptions differ. In practice, imputation models and analysis procedures are often not congenial, so that tests may not have the correct size and confidence interval coverage may deviate from the advertised level. We examine a number of recent proposals that combine bootstrapping with multiple imputation and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain methods that bootstrap first and then impute do. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
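For readers unfamiliar with the combination rules mentioned above, the following is a minimal sketch of Rubin's rules for a scalar parameter. It is not taken from the paper, and it is not the bootstrap-based procedure the authors recommend. It shows only the standard pooling step whose variance estimate can be invalid under uncongeniality or misspecification. The function name `rubin_pool` and the example numbers are illustrative assumptions.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool M completed-data estimates with Rubin's rules (scalar case).

    estimates: point estimates, one per imputed dataset
    variances: the corresponding within-imputation variance estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)          # average of the M point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor M - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical numbers, for illustration only.
pooled, total = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance adds the average within-imputation variance to the between-imputation variance inflated by a factor of 1 + 1/M. It is this estimate that the abstract says can be invalid when the imputation and analysis procedures are uncongenial.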