Practical considerations for imputation of untyped markers in admixed populations.

Affiliation

Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892-5635, USA.

Publication information

Genet Epidemiol. 2010 Apr;34(3):258-65. doi: 10.1002/gepi.20457.

Abstract

Imputation of genotypes for markers untyped in a study sample has become a standard approach to increase genome coverage in genome-wide association studies at practically zero cost. Most methods for imputing missing genotypes extend previously described algorithms for inferring haplotype phase. These algorithms generally fall into three classes based on the underlying model for estimating the conditional distribution of haplotype frequencies: a cluster-based model, a multinomial model, or a population genetics-based model. We compared BEAGLE, PLINK, and MACH, representing the three classes of models, respectively, with specific attention to measures of imputation success and selection of the reference panel for an admixed study sample of African Americans. Based on analysis of chromosome 22 and after calibration to a fixed level of 90% concordance between experimentally determined and imputed genotypes, MACH yielded the largest absolute number of successfully imputed markers and the largest gain in coverage of the variation captured by HapMap reference panels. Following the common practice of performing imputation once, the Yoruba in Ibadan, Nigeria (YRI) reference panel outperformed other HapMap reference panels, including (1) African ancestry from Southwest USA (ASW) data, (2) an unweighted combination of the Northern and Western Europe (CEU) and YRI data into a single reference panel, and (3) a combination of the CEU and YRI data into a single reference panel with weights matching estimates of admixture proportions. For our admixed study sample, the optimal strategy involved imputing twice with the HapMap CEU and YRI reference panels separately and then merging the data sets.
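The calibration and merging steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how confidence scores were thresholded or how the CEU-based and YRI-based results were merged, so everything below is an assumption for illustration only: genotypes are coded as allele counts (0/1/2), each imputed call carries a hypothetical confidence value in [0, 1], calibration picks the most permissive confidence cutoff that reaches the target concordance, and merging keeps the higher-confidence call per sample and marker. The names `concordance`, `calibrate_threshold`, and `merge_calls` are hypothetical and do not correspond to functions in BEAGLE, PLINK, or MACH.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]      # (sample_id, marker_id)
Call = Tuple[int, float]   # (genotype as allele count 0/1/2, confidence in [0, 1])


def concordance(imputed: Dict[Key, Call], typed: Dict[Key, int],
                min_conf: float = 0.0) -> Optional[float]:
    """Fraction of retained imputed calls that match the typed genotype.

    Only calls with confidence >= min_conf that also have an experimentally
    determined genotype are counted. Returns None if nothing is retained.
    """
    kept = [(g, typed[k]) for k, (g, c) in imputed.items()
            if c >= min_conf and k in typed]
    if not kept:
        return None
    return sum(g == t for g, t in kept) / len(kept)


def calibrate_threshold(imputed: Dict[Key, Call], typed: Dict[Key, int],
                        target: float = 0.90) -> Optional[float]:
    """Lowest observed confidence cutoff whose retained calls reach `target`.

    Assumption: calibration means filtering on a per-call confidence score.
    Returns None if no cutoff reaches the target.
    """
    for thr in sorted({c for _, c in imputed.values()}):
        rate = concordance(imputed, typed, thr)
        if rate is not None and rate >= target:
            return thr
    return None


def merge_calls(a: Dict[Key, Call], b: Dict[Key, Call]) -> Dict[Key, Call]:
    """Merge two imputation runs, keeping the higher-confidence call per key.

    Assumption: this tie-breaking rule is illustrative; ties keep the call from `a`.
    """
    merged = dict(a)
    for k, call in b.items():
        if k not in merged or call[1] > merged[k][1]:
            merged[k] = call
    return merged


# Toy data (invented for illustration, not taken from the study).
typed = {("s1", "rs1"): 0, ("s1", "rs2"): 1, ("s1", "rs3"): 2, ("s1", "rs4"): 1}
yri_run = {("s1", "rs1"): (0, 0.95), ("s1", "rs2"): (1, 0.90),
           ("s1", "rs3"): (1, 0.40), ("s1", "rs4"): (1, 0.85)}
ceu_run = {("s1", "rs3"): (2, 0.80), ("s1", "rs5"): (0, 0.70)}

cutoff = calibrate_threshold(yri_run, typed, target=0.90)
merged = merge_calls(yri_run, ceu_run)
```

In this toy example, the low-confidence YRI-based call at `rs3` is replaced by the higher-confidence CEU-based call, and `rs5`, imputed only with the CEU panel, is added to the merged set. That is one way merging two runs could increase both accuracy and the number of markers.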


