Imputation methods for missing data for polygenic models.

Authors

Fridley Brooke, Rabe Kari, de Andrade Mariza

Affiliation

Department of Statistics, Iowa State University, Ames, Iowa, USA.

Publication information

BMC Genet. 2003 Dec 31;4 Suppl 1(Suppl 1):S42. doi: 10.1186/1471-2156-4-S1-S42.

Abstract

Methods for handling missing data have been an area of statistical research for many years, but little has been done in the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. Both imputation schemes take familial relationships into account and use the observed familial information for the imputation. The first is a traditional multiple imputation approach; the second is a multiple imputation, or data augmentation, approach within a Gibbs sampler. We used both the Genetic Analysis Workshop 13 simulated missing-phenotype and complete-phenotype data sets to illustrate the two methods. We analyzed the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of complete and missing data incorporating multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation because it is easily extended to more complicated models, gives consistent results, and accounts for the variation due to imputation.
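The abstract does not give the imputation formulas, so the following is only a minimal sketch of how an imputation that uses familial information might look, not the authors' implementation. It assumes the usual polygenic covariance for one family, 2Φσ²_g + Iσ²_e (Φ is the kinship matrix), draws each missing phenotype from its conditional normal distribution given the observed relatives, and pools estimates across imputed data sets with Rubin's rules. All names (`polygenic_cov`, `draw_missing`, `rubin_pool`) and all numeric values are hypothetical, and the variance components are treated as known, which a full Gibbs sampler would not do.

```python
import numpy as np

def polygenic_cov(kinship, sigma2_g, sigma2_e):
    """Within-family covariance under a polygenic model: 2*Phi*sigma_g^2 + I*sigma_e^2."""
    n = kinship.shape[0]
    return 2.0 * kinship * sigma2_g + np.eye(n) * sigma2_e

def draw_missing(y, mean, cov, rng):
    """Fill NaN entries of y with one draw from their conditional normal
    distribution given the observed family members.
    Assumes at least one family member is observed."""
    miss = np.isnan(y)
    obs = ~miss
    out = y.copy()
    if not miss.any():
        return out
    c_oo = cov[np.ix_(obs, obs)]
    c_mo = cov[np.ix_(miss, obs)]
    c_mm = cov[np.ix_(miss, miss)]
    w = np.linalg.solve(c_oo, c_mo.T).T            # c_mo @ inv(c_oo)
    cond_mean = mean[miss] + w @ (y[obs] - mean[obs])
    cond_cov = c_mm - w @ c_mo.T
    cond_cov = (cond_cov + cond_cov.T) / 2.0       # remove numerical asymmetry
    out[miss] = rng.multivariate_normal(cond_mean, cond_cov)
    return out

def rubin_pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance
    (within-imputation + (1 + 1/m) * between-imputation)."""
    m = len(estimates)
    q_bar = float(np.mean(estimates))
    within = float(np.mean(variances))
    between = float(np.var(estimates, ddof=1))
    return q_bar, within + (1.0 + 1.0 / m) * between

# Hypothetical nuclear family: father, mother, two children.
kinship = np.array([
    [0.50, 0.00, 0.25, 0.25],
    [0.00, 0.50, 0.25, 0.25],
    [0.25, 0.25, 0.50, 0.25],
    [0.25, 0.25, 0.25, 0.50],
])
cov = polygenic_cov(kinship, sigma2_g=200.0, sigma2_e=200.0)  # assumed values
mean = np.full(4, 130.0)                     # assumed mean systolic blood pressure
y = np.array([140.0, 120.0, np.nan, 135.0])  # first child is unobserved

rng = np.random.default_rng(0)
completed = [draw_missing(y, mean, cov, rng) for _ in range(5)]

# Naive per-data-set estimate of the family mean; its variance ignores
# the family correlation and is used here for illustration only.
estimates = [d.mean() for d in completed]
variances = [d.var(ddof=1) / len(d) for d in completed]
pooled_mean, total_var = rubin_pool(estimates, variances)
```

The between-imputation term in `rubin_pool` is what carries the extra uncertainty due to imputation into the final variance; in a data augmentation scheme the same draw would instead be repeated inside each Gibbs iteration with the current parameter values.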
