Assessment of genotype imputation performance using 1000 Genomes in African American studies.

Affiliation

Behavioral Health Epidemiology Program, Research Triangle Institute International, Research Triangle Park, North Carolina, United States of America.

Publication information

PLoS One. 2012;7(11):e50610. doi: 10.1371/journal.pone.0050610. Epub 2012 Nov 30.

Abstract

Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI), European Americans (CEU), and Asians (CHB/JPT). The 1000 Genomes Project offers a wider range of reference populations, such as African Americans (ASW), but their imputation performance has had limited evaluation. Using 595 African Americans genotyped on Illumina's HumanHap550v3 BeadChip, we compared imputation results from four software programs (IMPUTE2, BEAGLE, MaCH, and MaCH-Admix) and three reference panels consisting of different combinations of 1000 Genomes populations (February 2012 release): (1) 3 specifically selected populations (YRI, CEU, and ASW); (2) 8 populations of diverse African (AFR) or European (EUR) descent; and (3) all 14 available populations (ALL). Based on chromosome 22, we calculated three performance metrics: (1) concordance (percentage of masked genotyped SNPs with imputed and true genotype agreement); (2) imputation quality score (IQS; concordance adjusted for chance agreement, which is particularly informative for low minor allele frequency [MAF] SNPs); and (3) average r2hat (estimated correlation between the imputed and true genotypes, for all imputed SNPs). Across the reference panels, IMPUTE2 and MaCH had the highest concordance (91%-93%), but IMPUTE2 had the highest IQS (81%-83%) and average r2hat (0.68 using YRI+ASW+CEU, 0.62 using AFR+EUR, and 0.55 using ALL). Imputation quality for most programs was reduced by the addition of more distantly related reference populations, due entirely to the introduction of low-frequency SNPs (MAF≤2%) that are monomorphic in the more closely related panels. While imputation was optimized by using IMPUTE2 with reference to the ALL panel (average r2hat = 0.86 for SNPs with MAF>2%), use of the ALL panel for African American studies requires careful interpretation of the population specificity and imputation quality of low-frequency SNPs.
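
To illustrate why a chance-adjusted score is more informative than raw concordance for low-MAF SNPs, the following is a minimal sketch, not the authors' code. It assumes genotypes are coded as minor allele counts (0, 1, 2) and that the chance adjustment takes a Cohen's-kappa-style form, (observed agreement - chance agreement) / (1 - chance agreement), which matches the abstract's description of IQS; the exact computation used in the study is not given here. The function names are hypothetical.

```python
from collections import Counter


def concordance(true_genotypes, imputed_genotypes):
    """Fraction of samples whose imputed genotype equals the true genotype."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def chance_adjusted_concordance(true_genotypes, imputed_genotypes):
    """Kappa-style score (assumed form of IQS): agreement beyond chance."""
    n = len(true_genotypes)
    p_observed = concordance(true_genotypes, imputed_genotypes)
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    # Chance agreement: product of the marginal genotype frequencies,
    # summed over genotype classes.
    p_chance = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / n**2
    if p_chance == 1.0:
        return float("nan")  # undefined when both sets are monomorphic
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical rare SNP: 2 heterozygotes among 100 samples,
# but the imputation calls every sample homozygous for the major allele.
true_calls = [0] * 98 + [1] * 2
imputed_calls = [0] * 100

raw = concordance(true_calls, imputed_calls)
adjusted = chance_adjusted_concordance(true_calls, imputed_calls)
print(f"concordance = {raw:.2f}, chance-adjusted = {adjusted:.2f}")
```

In this made-up example the raw concordance is high even though no minor allele was recovered, while the chance-adjusted score drops to zero, which is the behavior the abstract attributes to IQS at low MAF.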


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/235a/3511547/d752662bd484/pone.0050610.g001.jpg
