Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Royal Children's Hospital, Flemington Road, Parkville, Victoria 3052, Australia.
Am J Epidemiol. 2010 Mar 1;171(5):624-32. doi: 10.1093/aje/kwp425. Epub 2010 Jan 27.
Statistical analysis in epidemiologic studies is often hindered by missing data, and multiple imputation is increasingly being used to handle this problem. In a simulation study, the authors compared 2 methods for imputation that are widely available in standard software: fully conditional specification (FCS) or "chained equations" and multivariate normal imputation (MVNI). The authors created data sets of 1,000 observations to simulate a cohort study, and missing data were induced under 3 missing-data mechanisms. Imputations were performed using FCS (Royston's "ice") and MVNI (Schafer's NORM) in Stata (Stata Corporation, College Station, Texas), with transformations or prediction matching being used to manage nonnormality in the continuous variables. Inferences for a set of regression parameters were compared between these approaches and a complete-case analysis. As expected, both FCS and MVNI were generally less biased than complete-case analysis, and both produced similar results despite the presence of binary and ordinal variables that clearly did not follow a normal distribution. Ignoring skewness in a continuous covariate led to large biases and poor coverage for the corresponding regression parameter under both approaches, although inferences for other parameters were largely unaffected. These results provide reassurance that similar results can be expected from FCS and MVNI in a standard regression analysis involving variously scaled variables.
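The contrast between complete-case analysis and multiple imputation described above can be illustrated with a minimal sketch. This is not the authors' simulation, which used Royston's "ice" and Schafer's NORM in Stata with several variously scaled variables. It assumes a much simpler setting: one normally distributed covariate `x` with values missing at random (missingness depends only on the fully observed outcome `y`), a true slope of 0.5, roughly 50% missingness, and a normal linear imputation model. With a single incomplete continuous variable, FCS and MVNI reduce to essentially the same procedure, so the sketch does not compare the two. All function names, parameter values, and the missingness model are illustrative assumptions.

```python
import numpy as np

TRUE_SLOPE = 0.5  # assumed true regression coefficient (illustrative)

def ols(X, y):
    """Ordinary least squares: coefficients, their covariance, sigma^2, (X'X)^-1."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, sigma2 * xtx_inv, sigma2, xtx_inv

def simulate(n, rng):
    """Simulate a cohort of n subjects and flag x as missing at random given y."""
    x = rng.normal(size=n)
    y = 1.0 + TRUE_SLOPE * x + rng.normal(size=n)
    p_miss = 1.0 / (1.0 + np.exp(-2.0 * (y - 1.0)))  # assumed MAR mechanism
    miss = rng.random(n) < p_miss
    return x, y, miss

def impute_once(x, y, miss, rng):
    """One proper imputation of x from a normal linear model of x on y."""
    obs = ~miss
    Z = np.column_stack([np.ones(obs.sum()), y[obs]])
    beta, _, sigma2, xtx_inv = ols(Z, x[obs])
    df = obs.sum() - Z.shape[1]
    # Draw the imputation-model parameters from their approximate posterior.
    sigma2_star = sigma2 * df / rng.chisquare(df)
    beta_star = rng.multivariate_normal(beta, sigma2_star * xtx_inv)
    Zm = np.column_stack([np.ones(miss.sum()), y[miss]])
    x_imp = x.copy()
    x_imp[miss] = Zm @ beta_star + rng.normal(scale=np.sqrt(sigma2_star), size=miss.sum())
    return x_imp

def rubin_combine(estimates, variances):
    """Rubin's rules: pooled estimate and total variance across m imputations."""
    m = len(estimates)
    within = np.mean(variances)
    between = np.var(estimates, ddof=1)
    return float(np.mean(estimates)), float(within + (1.0 + 1.0 / m) * between)

def analyse(n, m, rng):
    """Return the complete-case slope and the multiply imputed slope for one data set."""
    x, y, miss = simulate(n, rng)
    cc = ~miss
    beta_cc, _, _, _ = ols(np.column_stack([np.ones(cc.sum()), x[cc]]), y[cc])
    ests, variances = [], []
    for _ in range(m):
        x_imp = impute_once(x, y, miss, rng)
        beta, cov, _, _ = ols(np.column_stack([np.ones(n), x_imp]), y)
        ests.append(beta[1])
        variances.append(cov[1, 1])
    mi_slope, _ = rubin_combine(ests, variances)
    return beta_cc[1], mi_slope

def run_replications(n_reps=20, n=1000, m=20, seed=0):
    """Average bias of each approach over n_reps simulated data sets."""
    rng = np.random.default_rng(seed)
    results = np.array([analyse(n, m, rng) for _ in range(n_reps)])
    cc_bias, mi_bias = results.mean(axis=0) - TRUE_SLOPE
    return {"cc_bias": float(cc_bias), "mi_bias": float(mi_bias)}

if __name__ == "__main__":
    print(run_replications())
```

Under these assumptions the complete-case slope is expected to be noticeably biased, because subjects are selected on the outcome, while the multiply imputed slope should sit close to the true value. The sketch says nothing about skewed, binary, or ordinal variables, which are the cases the study actually examined.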