Department of Mathematics and Statistics, Amherst College, PO Box 5000, AC #2239, Amherst, 01002, MA, USA.
College of Public Health, The Ohio State University, Columbus, 43210, OH, USA.
BMC Med Res Methodol. 2020 Mar 30;20(1):72. doi: 10.1186/s12874-020-00948-6.
Random effects regression imputation has been recommended for multiple imputation (MI) in cluster randomized trials (CRTs) because it is congenial to analyses that use random effects regression. However, this method relies heavily on model assumptions and may not be robust to misspecification of the imputation model. MI by predictive mean matching (PMM) is a semiparametric alternative, but current software for multilevel data relies on imputation models that either ignore clustering or use fixed effects for clusters. When used directly for imputation, these two models bias the MI variance estimates in opposite directions: ignoring clustering leads to underestimation, and using fixed effects for clusters leads to overestimation.
We develop MI procedures based on PMM that leverage these opposing biases in the variance estimates in one of three ways: weighting the distance metric (PMM-dist), taking a weighted average of the final imputed values from the two PMM procedures (PMM-avg), or performing a weighted draw from the final imputed values from the two PMM procedures (PMM-draw). We use Monte Carlo simulations to evaluate the proposed methods against established MI procedures, focusing on estimation of treatment group means and their variances after MI.
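As a rough illustration of how the three combinations differ, the sketch below assumes that predicted means from the two imputation models (one ignoring clustering, one with fixed effects for clusters) have already been computed. The function names, the weight `w`, the donor pool size `k`, and the absolute-difference form of the weighted distance are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import random


def pmm_match(pred_obs, y_obs, pred_mis, k=5, rng=None):
    """Basic PMM: for each missing case, borrow the observed value of a
    donor drawn at random from the k observed cases whose predicted
    means are closest to the missing case's predicted mean."""
    rng = rng or random.Random()
    imputed = []
    for p in pred_mis:
        order = sorted(range(len(pred_obs)), key=lambda i: abs(pred_obs[i] - p))
        imputed.append(y_obs[rng.choice(order[:k])])
    return imputed


def pmm_dist(pred_obs_ignore, pred_obs_fixed, y_obs,
             pred_mis_ignore, pred_mis_fixed, w, k=5, rng=None):
    """PMM-dist (assumed form): one matching step whose distance is a
    weighted sum of the distances under the two imputation models."""
    rng = rng or random.Random()
    imputed = []
    for p_ign, p_fix in zip(pred_mis_ignore, pred_mis_fixed):
        dist = [w * abs(a - p_ign) + (1 - w) * abs(b - p_fix)
                for a, b in zip(pred_obs_ignore, pred_obs_fixed)]
        order = sorted(range(len(y_obs)), key=dist.__getitem__)
        imputed.append(y_obs[rng.choice(order[:k])])
    return imputed


def pmm_avg(imp_ignore, imp_fixed, w):
    """PMM-avg: weighted average of the values imputed by the two
    separate PMM procedures."""
    return [w * a + (1 - w) * b for a, b in zip(imp_ignore, imp_fixed)]


def pmm_draw(imp_ignore, imp_fixed, w, rng=None):
    """PMM-draw: for each missing case, keep the value from the
    clustering-ignored procedure with probability w, otherwise the
    value from the fixed-effects procedure."""
    rng = rng or random.Random()
    return [a if rng.random() < w else b
            for a, b in zip(imp_ignore, imp_fixed)]
```

One difference visible in this sketch is that PMM-draw and PMM-dist always return values actually observed in the data, whereas PMM-avg can produce values that no participant reported.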
The proposed PMM procedures reduce the bias in the MI variance estimator relative to established methods when the imputation model is correctly specified, and they are generally more robust to model misspecification than even the random effects imputation methods.
The PMM-draw procedure in particular is a promising method for multiply imputing missing data from CRTs that can be readily implemented in existing statistical software.