Yucel Recai M
Department of Epidemiology and Biostatistics, School of Public Health, University at Albany, SUNY.
Stat Modelling. 2011 Aug;11(4):351-370. doi: 10.1177/1471082X1001100404.
Principled techniques for incomplete-data problems are increasingly part of mainstream statistical practice. Among the many techniques proposed so far, inference by multiple imputation (MI) has emerged as one of the most popular. While many strategies leading to inference by MI are available in cross-sectional settings, the same richness does not exist in multilevel applications. The limited methods available for multilevel applications rely on multivariate adaptations of mixed-effects models. This approach preserves the mean structure across clusters and incorporates distinct variance components into the imputation process. In this paper, I add to these methods by considering a random covariance structure and developing the associated computational algorithms. The attraction of this new imputation modeling strategy is that it correctly reflects the mean and variance structure of the joint distribution of the data and allows the covariances to differ across clusters. Using Markov chain Monte Carlo techniques, I simulate the predictive distribution of the missing data given the observed data, which leads to the creation of multiple imputations. To circumvent the large sample size required to support independent covariance estimates for the level-1 error term, I consider a priori distributional assumptions that mimic random-effects distributions. These techniques are illustrated in an example exploring relationships between victimization and the individual- and contextual-level factors that raise the risk of violent crime.
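As a point of reference for "inference by MI", the sketch below shows the standard combining step (Rubin's rules) that is applied once multiple imputations have been created and each completed data set has been analyzed. It is not the imputation algorithm of this paper. The function name `pool_rubin` and the numbers in the usage lines are hypothetical and serve only as illustration, and the sketch assumes a scalar quantity of interest.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates of a scalar using Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")

    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate

    # Degrees of freedom of the t reference distribution; with no
    # between-imputation variability the reference is effectively normal.
    if b == 0:
        df = float("inf")
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, total, df


# Hypothetical results from m = 3 imputed data sets (illustrative values only).
q_bar, total, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to missing data into the final inference, which is why the imputation model needs to reflect the variance and covariance structure of the data faithfully.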