Fridley Brooke, Rabe Kari, de Andrade Mariza
Department of Statistics, Iowa State University, Ames, Iowa, USA.
BMC Genet. 2003 Dec 31;4 Suppl 1(Suppl 1):S42. doi: 10.1186/1471-2156-4-S1-S42.
Methods for handling missing data have been an area of statistical research for many years, but little has been done in the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. Both imputation schemes take familial relationships into account and use the observed familial information for the imputation. The first is a traditional multiple imputation approach; the second is a multiple imputation, or data augmentation, approach within a Gibbs sampler. We used both the Genetic Analysis Workshop 13 simulated missing phenotype data set and the complete phenotype data set to illustrate the two methods. We examined the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of the complete data and of the missing data with multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. We therefore recommend the Gibbs sampler for imputation because of the ease with which it can be extended to more complicated models, the consistency of its results, and the way it accounts for the variation due to imputation.
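As a rough illustration of how a traditional multiple imputation analysis can account for the variation due to imputation, the sketch below pools estimates from several imputed data sets using Rubin's rules. This is not the authors' implementation: the function name `pool_rubin` is hypothetical, and the numbers are made up purely to show the arithmetic, not taken from the Genetic Analysis Workshop 13 data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their sampling variances (Rubin's rules).

    estimates: one parameter estimate per imputed data set
    variances: the squared standard error of each of those estimates
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b      # total variance of the pooled estimate
    return q_bar, w, b, t


# Hypothetical estimates of a single model parameter from three imputed data sets.
estimates = [0.30, 0.34, 0.32]
variances = [0.010, 0.012, 0.011]

q_bar, w, b, t = pool_rubin(estimates, variances)
print(f"pooled estimate = {q_bar:.4f}")
print(f"within = {w:.6f}, between = {b:.6f}, total = {t:.6f}")
```

The between-imputation term `b` is the part of the uncertainty that comes from not knowing the missing values. A Gibbs sampler with data augmentation handles this differently, redrawing the missing values at each iteration so that the posterior draws already reflect that uncertainty.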