Assessment of genotype imputation performance using 1000 Genomes in African American studies.

Affiliation

Behavioral Health Epidemiology Program, Research Triangle Institute International, Research Triangle Park, North Carolina, United States of America.

Publication information

PLoS One. 2012;7(11):e50610. doi: 10.1371/journal.pone.0050610. Epub 2012 Nov 30.

DOI:10.1371/journal.pone.0050610

PMID:23226329

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3511547/

Abstract

Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI), European Americans (CEU), and Asians (CHB/JPT). The 1000 Genomes project offers a wider range of reference populations, such as African Americans (ASW), but their imputation performance has had limited evaluation. Using 595 African Americans genotyped on Illumina's HumanHap550v3 BeadChip, we compared imputation results from four software programs (IMPUTE2, BEAGLE, MaCH, and MaCH-Admix) and three reference panels consisting of different combinations of 1000 Genomes populations (February 2012 release): (1) 3 specifically selected populations (YRI, CEU, and ASW); (2) 8 populations of diverse African (AFR) or European (EUR) descent; and (3) all 14 available populations (ALL). Based on chromosome 22, we calculated three performance metrics: (1) concordance (percentage of masked genotyped SNPs with imputed and true genotype agreement); (2) imputation quality score (IQS; concordance adjusted for chance agreement, which is particularly informative for low minor allele frequency [MAF] SNPs); and (3) average r2hat (estimated correlation between the imputed and true genotypes, for all imputed SNPs). Across the reference panels, IMPUTE2 and MaCH had the highest concordance (91%-93%), but IMPUTE2 had the highest IQS (81%-83%) and average r2hat (0.68 using YRI+ASW+CEU, 0.62 using AFR+EUR, and 0.55 using ALL). Imputation quality for most programs was reduced by the addition of more distantly related reference populations, due entirely to the introduction of low frequency SNPs (MAF≤2%) that are monomorphic in the more closely related panels. While imputation was optimized by using IMPUTE2 with reference to the ALL panel (average r2hat = 0.86 for SNPs with MAF>2%), use of the ALL panel for African American studies requires careful interpretation of the population specificity and imputation quality of low frequency SNPs.
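
To illustrate why a chance-adjusted score is more informative than raw concordance for low-MAF SNPs, the following is a minimal sketch, not the authors' implementation. It assumes best-guess genotypes coded as 0, 1, or 2 copies of the minor allele and a kappa-style definition of IQS, (observed agreement − chance agreement) / (1 − chance agreement); the function names and the toy data are hypothetical.

```python
from collections import Counter


def concordance(true_genotypes, imputed_genotypes):
    """Fraction of samples whose imputed genotype equals the true genotype."""
    if not true_genotypes or len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def imputation_quality_score(true_genotypes, imputed_genotypes):
    """Kappa-style IQS: concordance adjusted for agreement expected by chance.

    Assumption: chance agreement is the sum over genotype classes of
    (true count * imputed count) / n^2, as in Cohen's kappa.
    """
    n = len(true_genotypes)
    p_observed = concordance(true_genotypes, imputed_genotypes)
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    p_chance = sum(true_counts[g] * imputed_counts[g] for g in (0, 1, 2)) / n**2
    if p_chance == 1.0:
        # Both call sets are monomorphic for the same genotype: IQS is undefined.
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy low-MAF SNP: one heterozygote among 20 samples, imputed as all homozygous.
true_calls = [0] * 19 + [1]
imputed_calls = [0] * 20

print(concordance(true_calls, imputed_calls))               # 0.95
print(imputation_quality_score(true_calls, imputed_calls))  # 0.0
```

In this toy case the imputation never recovers the rare allele, yet concordance is still 95% because the major homozygote dominates; the chance-adjusted score drops to zero, which is the behavior the abstract refers to for low-MAF SNPs.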
