1 Service de Biostatistique et Information Médicale, Hôpital Saint-Louis, Paris, France.
2 Université Paris Diderot - Paris 7, Sorbonne Paris Cité, Paris, France.
Stat Methods Med Res. 2018 Jun;27(6):1634-1649. doi: 10.1177/0962280216666564. Epub 2016 Sep 19.
In multilevel settings such as individual participant data meta-analysis, a variable is 'systematically missing' if it is wholly missing in some clusters and 'sporadically missing' if it is partly missing in some clusters. Previously proposed methods to impute incomplete multilevel data handle either systematically or sporadically missing data, but frequently both patterns are observed. We describe a new multiple imputation by chained equations (MICE) algorithm for multilevel data with arbitrary patterns of systematically and sporadically missing variables. The algorithm is described for multilevel normal data but can easily be extended for other variable types. We first propose two methods for imputing a single incomplete variable: an extension of an existing method and a new two-stage method which conveniently allows for heteroscedastic data. We then discuss the difficulties of imputing missing values in several variables in multilevel data using MICE, and show that even the simplest joint multilevel model implies conditional models which involve cluster means and heteroscedasticity. However, a simulation study finds that the proposed methods can be successfully combined in a multilevel MICE procedure, even when cluster means are not included in the imputation models.
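The distinction between the two missingness patterns defined above can be made concrete with a small check. The following is a minimal sketch, not the authors' algorithm: it only labels each cluster by the pattern of one variable, and the function name `classify_missingness`, the field names `study` and `bmi`, and the toy records are all hypothetical, chosen for illustration.

```python
from collections import defaultdict


def classify_missingness(records, cluster_key, variable):
    """Label each cluster 'complete', 'sporadic', or 'systematic' for one variable.

    Hypothetical helper: a value counts as missing when it is None or absent.
    """
    counts = defaultdict(lambda: [0, 0])  # cluster -> [n_missing, n_total]
    for row in records:
        tally = counts[row[cluster_key]]
        tally[1] += 1
        if row.get(variable) is None:
            tally[0] += 1

    labels = {}
    for cluster, (n_missing, n_total) in counts.items():
        if n_missing == 0:
            labels[cluster] = "complete"
        elif n_missing == n_total:
            # Wholly missing within the cluster
            labels[cluster] = "systematic"
        else:
            # Partly missing within the cluster
            labels[cluster] = "sporadic"
    return labels


# Toy data (invented): three studies, one variable
records = [
    {"study": "A", "bmi": 24.1},
    {"study": "A", "bmi": 27.3},
    {"study": "B", "bmi": None},
    {"study": "B", "bmi": 22.8},
    {"study": "C", "bmi": None},
    {"study": "C", "bmi": None},
]

labels = classify_missingness(records, "study", "bmi")
print(labels)  # {'A': 'complete', 'B': 'sporadic', 'C': 'systematic'}
```

In a dataset where both the 'sporadic' and 'systematic' labels appear for the same variable, a method that handles only one pattern would not suffice, which is the situation the abstract addresses.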