Tampere School of Public Health, University of Tampere, Finland.
Stat Med. 2009 Dec 20;28(29):3657-69. doi: 10.1002/sim.3731.
Multiple imputation (MI) has received increasing attention as a flexible tool for resolving missing data problems in both observational and controlled studies. Our goal has been to develop a valid and efficient MI procedure for the Diabetes Prediction and Prevention Nutrition Study, in which the diet of a cohort of newborn children with HLA-DQB1-conferred susceptibility to type 1 diabetes is repeatedly measured by 3-day food records over early childhood. The estimation of risk is based on a nested case-control design set up within the cohort. We have used an iterative procedure known as the fully conditional specification (FCS) to generate appropriate values for the missing dietary data, which here play the role of time-dependent covariates. Our method extends the standard FCS to repeated-measurement settings with possibly non-monotone missingness patterns by being doubly iterative over the follow-up time of the individuals. In addition, our proposed procedure is nonparametric in the sense that the variables can have distributions deviating strongly from normality: it uses quantile normal scores to transform to normality, performs imputations, and transforms back to the original scale. By using a moving time window and stepwise regression procedures, the two-fold FCS method operates well with a large number of variables, each measured repeatedly over time. Extensive simulation studies demonstrate that the procedure, together with the proposed transformations and variable selection methods, provides tools for valid and efficient statistical inference in the nested case-control setting, and its applications extend beyond that.
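The transform, impute, back-transform step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `to_normal_scores` and `from_normal_score`, the `(rank - 0.5) / n` plotting position, the linear interpolation used for the back-transformation, and the intake values are all assumptions made for illustration. The random draw is only a stand-in for a draw from the FCS conditional imputation model, which is not reproduced here.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def to_normal_scores(observed):
    """Replace each observed value by a quantile normal score based on its rank."""
    n = len(observed)
    order = sorted(range(n), key=lambda i: observed[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Assumed plotting position; keeps the probability strictly in (0, 1).
        scores[i] = _STD_NORMAL.inv_cdf((rank - 0.5) / n)
    return scores


def from_normal_score(z, observed):
    """Map a normal-scale value back to the original scale by linear
    interpolation between the empirical quantiles of the observed values."""
    xs = sorted(observed)
    n = len(xs)
    # Invert the plotting position: rank r sits at 0-based index r - 1.
    pos = _STD_NORMAL.cdf(z) * n - 0.5
    if pos <= 0:
        return xs[0]       # below the smallest observed value: clamp
    if pos >= n - 1:
        return xs[-1]      # above the largest observed value: clamp
    lo = int(pos)
    frac = pos - lo
    return xs[lo] + frac * (xs[lo + 1] - xs[lo])


# Made-up, strongly right-skewed values (not data from the study).
intake = [0.2, 0.3, 0.35, 0.5, 0.9, 1.4, 6.0]
scores = to_normal_scores(intake)

# Stand-in for the imputation model: in the actual procedure the draw would
# come from a regression fitted on the normal scale, not from N(0, 1).
rng = random.Random(0)
z_draw = rng.gauss(0.0, 1.0)
imputed = from_normal_score(z_draw, intake)
print(imputed)
```

Because the back-transformation interpolates between observed quantiles, an imputed value in this sketch always falls within the observed range and follows the skewed shape of the original variable, which is the sense in which such a procedure avoids assuming normality on the original scale.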