A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.

Authors

Howie Bryan N, Donnelly Peter, Marchini Jonathan

Affiliation

Department of Statistics, University of Oxford, Oxford, UK.

Publication information

PLoS Genet. 2009 Jun;5(6):e1000529. doi: 10.1371/journal.pgen.1000529. Epub 2009 Jun 19.

DOI:10.1371/journal.pgen.1000529

PMID:19543373

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2689936/

Abstract

Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%-20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af3b/2689936/f622eed0af64/pgen.1000529.g001.jpg
