Department of Epidemiology and Biostatistics, Social Determinants of Health Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
Department of Epidemiology, Research Center for Health Sciences, School of Health, Shiraz University of Medical Sciences, Shiraz, Iran.
J Epidemiol Glob Health. 2020 Mar;10(1):36-41. doi: 10.2991/jegh.k.191207.001.
This study aimed to evaluate five Multiple Imputation (MI) methods in the context of STEP-wise Approach to Surveillance (STEPS) surveys.
We selected a complete subsample of the STEPS survey data set and devised an experimental design consisting of 45 states (3 × 3 × 5), which differed by rate of simulated missing data, variable transformation, and MI method. In each state, missing data were simulated and then multiply imputed, and this process was repeated 50 times. Evaluation was based on Relative Bias (RB) and five other measures, each averaged over the 50 repetitions.
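The simulate-then-impute loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the RB formula (percentage deviation from the complete-data value), the missingness mechanism, the three rates (the abstract names only 40% and 60%), the function names, and the stochastic regression imputation used as a stand-in for the five MI methods are all hypothetical choices made for the example.

```python
import math
import random


def relative_bias(estimate, truth):
    """Relative bias in percent (assumed definition): 100 * (estimate - truth) / truth."""
    return 100.0 * (estimate - truth) / truth


def make_missing(x, rate, rng):
    """Pick the indices whose outcome is deleted (assumed mechanism).

    Units with a larger covariate x are more likely to be deleted, so
    missingness depends only on an observed variable plus random noise.
    """
    k = round(rate * len(x))
    scores = [xi + rng.gauss(0.0, 1.0) for xi in x]
    order = sorted(range(len(x)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def impute_mean(x, y, missing, m, rng):
    """Mean of y pooled over m stochastic regression imputations (stand-in method)."""
    obs = [i for i in range(len(y)) if i not in missing]
    mx = sum(x[i] for i in obs) / len(obs)
    my = sum(y[i] for i in obs) / len(obs)
    sxx = sum((x[i] - mx) ** 2 for i in obs)
    slope = sum((x[i] - mx) * (y[i] - my) for i in obs) / sxx
    intercept = my - slope * mx
    # Residual spread among respondents, used as the noise level for imputed values.
    resid_sd = math.sqrt(
        sum((y[i] - intercept - slope * x[i]) ** 2 for i in obs) / len(obs)
    )
    sum_obs = sum(y[i] for i in obs)
    means = []
    for _ in range(m):
        filled = sum(
            intercept + slope * x[i] + rng.gauss(0.0, resid_sd) for i in missing
        )
        means.append((sum_obs + filled) / len(y))
    return sum(means) / m


def run_design(x, y, rates=(0.2, 0.4, 0.6), reps=50, m=5, seed=0):
    """Average RB of the complete-case and imputed means over `reps` repetitions."""
    rng = random.Random(seed)
    truth = sum(y) / len(y)  # the complete subsample plays the role of the truth
    results = {}
    for rate in rates:
        rb_cc, rb_mi = [], []
        for _ in range(reps):
            missing = make_missing(x, rate, rng)
            obs_y = [y[i] for i in range(len(y)) if i not in missing]
            rb_cc.append(relative_bias(sum(obs_y) / len(obs_y), truth))
            rb_mi.append(relative_bias(impute_mean(x, y, missing, m, rng), truth))
        results[rate] = (sum(rb_cc) / reps, sum(rb_mi) / reps)
    return results
```

Calling `run_design(x, y)` on a complete data set returns, for each assumed rate, the average RB of the respondents-only mean and of the imputed mean. Extending the loop over transformations and imputation methods would give the full 3 × 3 × 5 grid.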
In estimation of mean, Predictive Mean Matching (PMM) and Multiple Imputation by Chained Equation (MICE) could compensate for the nonresponse bias. Ln and Box-Cox (BC) transformation should be applied when the nonresponse rate reaches 40% and 60%, respectively. In estimation of proportion, PMM, MICE, bootstrap expectation maximization algorithm (BEM), and linear regression accompanied by BC transformation could correct for the nonresponse bias. Our findings show that even with 60% of nonresponse rate some of the MI methods could satisfactorily result in estimates with negligible RB.
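For readers unfamiliar with the two transformations, the sketch below shows the standard Box-Cox formula and its inverse, with the natural log as the special case λ = 0. The usual workflow is to transform a positive, skewed variable, impute on the transformed scale, and back-transform the imputed values. The function names and the λ value in the usage note are illustrative assumptions; the abstract does not state how λ was chosen.

```python
import math


def boxcox(y, lam):
    """Box-Cox transform of a positive value; lam = 0 reduces to the natural log."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inv_boxcox(z, lam):
    """Inverse transform, mapping a value on the transformed scale back to the original."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)
```

For example, with an assumed λ of 0.5, `inv_boxcox(boxcox(7.5, 0.5), 0.5)` recovers 7.5 up to floating-point error, which is the round trip applied to each imputed value.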
The choice of MI method and variable transformation should be made with caution. No single method can be regarded as uniformly the best or the worst; each can outperform the others when applied in the right situation. Even within a given situation, one method may be best in terms of validity while another is best in terms of precision.