MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 0SR, U.K.
Stat Med. 2011 Feb 20;30(4):377-99. doi: 10.1002/sim.4067. Epub 2010 Nov 30.
Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments.