Imputation methods for missing data for polygenic models.

Authors

Fridley Brooke, Rabe Kari, de Andrade Mariza

Affiliation

Department of Statistics, Iowa State University, Ames, Iowa, USA.

Publication information

BMC Genet. 2003 Dec 31;4 Suppl 1(Suppl 1):S42. doi: 10.1186/1471-2156-4-S1-S42.

DOI: 10.1186/1471-2156-4-S1-S42

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1866478/

Abstract

Methods for handling missing data have been an area of statistical research for many years, but little has been done in the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. Both imputation schemes take familial relationships into account and use the observed familial information for the imputation. The first is a traditional multiple imputation approach; the second is a multiple imputation, or data augmentation, approach within a Gibbs sampler. We used both the Genetic Analysis Workshop 13 simulated missing phenotype and complete phenotype data sets to illustrate the two methods. We examined the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of complete data and of missing data incorporating multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation because of the ease with which it can be extended to more complicated models, the consistency of its results, and its accounting for the variation due to imputation.
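The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a bivariate normal polygenic model in which Var(y) = sigma_g2 + sigma_e2 and the covariance between two relatives is 2 * kinship * sigma_g2, imputes a missing phenotype from one observed relative, and pools the completed-data estimates with Rubin's rules so that between-imputation variation is counted. All function names, parameter values, and the sib-pair data are hypothetical, and the model parameters are treated as known rather than drawn, which a full multiple imputation or Gibbs sampler would not do.

```python
import random
import statistics


def conditional_moments(y_rel, mu, sigma_g2, sigma_e2, kinship):
    """Mean and variance of a missing phenotype given one observed relative.

    Assumes a bivariate normal polygenic model with
    Var(y) = sigma_g2 + sigma_e2 and Cov(y_i, y_j) = 2 * kinship * sigma_g2.
    """
    var = sigma_g2 + sigma_e2
    cov = 2.0 * kinship * sigma_g2
    cond_mean = mu + (cov / var) * (y_rel - mu)
    cond_var = var - cov * cov / var
    return cond_mean, cond_var


def impute_once(y_rel, mu, sigma_g2, sigma_e2, kinship, rng):
    """Draw one imputed value from the conditional normal distribution."""
    cond_mean, cond_var = conditional_moments(y_rel, mu, sigma_g2, sigma_e2, kinship)
    return rng.gauss(cond_mean, cond_var ** 0.5)


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their variances (Rubin's rules)."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)  # sample variance, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


# Hypothetical sib pairs: (observed sib SBP, second sib SBP or None if missing).
pairs = [(140.0, None), (118.0, 122.0), (131.0, None), (109.0, 115.0)]
rng = random.Random(13)
estimates, variances = [], []
for _ in range(5):  # m = 5 imputed data sets
    completed = []
    for y1, y2 in pairs:
        if y2 is None:
            # Full sibs have kinship coefficient 0.25.
            y2 = impute_once(y1, mu=120.0, sigma_g2=100.0, sigma_e2=100.0,
                             kinship=0.25, rng=rng)
        completed.extend([y1, y2])
    estimates.append(statistics.mean(completed))
    # Naive variance of the mean; ignores within-family correlation.
    variances.append(statistics.variance(completed) / len(completed))

pooled_mean, pooled_var = pool_rubin(estimates, variances)
print(f"pooled mean SBP = {pooled_mean:.1f}, total variance = {pooled_var:.2f}")
```

The pooled variance adds the between-imputation term (1 + 1/m) * B to the average within-imputation variance, which is one way of accounting for the variation due to imputation. Inside a Gibbs sampler, the same conditional draw would be repeated at every iteration using the current parameter values.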
