Practical issues in imputation-based association mapping.

Author information

Guan Yongtao, Stephens Matthew

Affiliation

Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America.

Publication information

PLoS Genet. 2008 Dec;4(12):e1000279. doi: 10.1371/journal.pgen.1000279. Epub 2008 Dec 5.

DOI:10.1371/journal.pgen.1000279

PMID:19057666

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2585794/

Abstract

Imputation-based association methods provide a powerful framework for testing untyped variants for association with phenotypes and for combining results from multiple studies that use different genotyping platforms. Here, we consider several issues that arise when applying these methods in practice, including: (i) factors affecting imputation accuracy, including choice of reference panel; (ii) the effects of imputation accuracy on power to detect associations; (iii) the relative merits of Bayesian and frequentist approaches to testing imputed genotypes for association with phenotype; and (iv) how to quickly and accurately compute Bayes factors for testing imputed SNPs. We find that imputation-based methods can be robust to imputation accuracy and can improve power to detect associations, even when average imputation accuracy is poor. We explain how ranking SNPs for association by a standard likelihood ratio test gives the same results as a Bayesian procedure that uses an unnatural prior assumption--specifically, that difficult-to-impute SNPs tend to have larger effects--and assess the power gained from using a Bayesian approach that does not make this assumption. Within the Bayesian framework, we find that good approximations to a full analysis can be achieved by simply replacing unknown genotypes with a point estimate--their posterior mean. This approximation considerably reduces computational expense compared with published sampling-based approaches, and the methods we present are practical on a genome-wide scale with very modest computational resources (e.g., a single desktop computer). The approximation also facilitates combining information across studies, using only summary data for each SNP. Methods discussed here are implemented in the software package BIMBAM, which is available from http://stephenslab.uchicago.edu/software.html.
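
The point estimate mentioned in the abstract, the posterior mean genotype, can be illustrated with a minimal sketch. It assumes each imputed genotype is given as three posterior probabilities for carrying 0, 1, or 2 copies of a reference allele. The function name and the example values are hypothetical and are not taken from BIMBAM; the sketch shows only the expectation itself, not the Bayes factor computation.

```python
from typing import List, Sequence


def posterior_mean_genotypes(probs: Sequence[Sequence[float]]) -> List[float]:
    """Return the posterior mean allele count for each individual.

    probs holds one (p0, p1, p2) triple per individual: the posterior
    probabilities of carrying 0, 1, or 2 copies of the counted allele.
    This input layout is an assumption made for illustration.
    """
    means = []
    for p0, p1, p2 in probs:
        total = p0 + p1 + p2
        if abs(total - 1.0) > 1e-6:
            raise ValueError(f"probabilities must sum to 1, got {total}")
        # E[g] = 0*p0 + 1*p1 + 2*p2
        means.append(p1 + 2.0 * p2)
    return means


# Hypothetical imputation output for three individuals at one SNP.
example_probs = [
    (1.0, 0.0, 0.0),    # confidently homozygous for the other allele
    (0.1, 0.8, 0.1),    # most likely heterozygous
    (0.0, 0.25, 0.75),  # uncertain between one and two copies
]
dosages = posterior_mean_genotypes(example_probs)
```

The resulting values lie between 0 and 2 and can stand in for the unknown genotypes in a downstream association test, which is the kind of plug-in approximation the abstract describes.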

Similar articles

Practical issues in imputation-based association mapping.

PLoS Genet. 2008 Dec;4(12):e1000279. doi: 10.1371/journal.pgen.1000279. Epub 2008 Dec 5.

Accuracy of genome-wide imputation of untyped markers and impacts on statistical power for association studies.

BMC Genet. 2009 Jun 16;10:27. doi: 10.1186/1471-2156-10-27.

Imputation-based analysis of association studies: candidate regions and quantitative traits.

PLoS Genet. 2007 Jul;3(7):e114. doi: 10.1371/journal.pgen.0030114. Epub 2007 May 30.

Analysis of untyped SNPs: maximum likelihood and imputation methods.

Genet Epidemiol. 2010 Dec;34(8):803-15. doi: 10.1002/gepi.20527.

Imputation of missing genotypes: an empirical evaluation of IMPUTE.

BMC Genet. 2008 Dec 12;9:85. doi: 10.1186/1471-2156-9-85.

DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts.

Bioinformatics. 2015 Oct 1;31(19):3099-104. doi: 10.1093/bioinformatics/btv348. Epub 2015 Jun 9.

Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.

Bioinformatics. 2014 Oct 15;30(20):2906-14. doi: 10.1093/bioinformatics/btu416. Epub 2014 Jul 1.

Evaluation of the accuracy of imputed sequence variant genotypes and their utility for causal variant detection in cattle.

Genet Sel Evol. 2017 Feb 21;49(1):24. doi: 10.1186/s12711-017-0301-x.

FAPI: Fast and accurate P-value Imputation for genome-wide association study.

Eur J Hum Genet. 2016 May;24(5):761-6. doi: 10.1038/ejhg.2015.190. Epub 2015 Aug 26.

GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies.

BMC Genomics. 2014 Jul 19;15:610. doi: 10.1186/1471-2164-15-610.

Cited by

Functionally-informed fine-mapping identifies genetic variants linking increased CHD1L expression and HIV restriction in monocytes.

Sci Rep. 2025 Jan 17;15(1):2325. doi: 10.1038/s41598-024-84817-y.

The Bayesian lens and Bayesian blinkers.

Philos Trans A Math Phys Eng Sci. 2023 May 15;381(2247):20220144. doi: 10.1098/rsta.2022.0144. Epub 2023 Mar 27.

False positive findings during genome-wide association studies with imputation: influence of allele frequency and imputation accuracy.

Hum Mol Genet. 2021 Dec 17;31(1):146-155. doi: 10.1093/hmg/ddab203.

Impact of pre- and post-variant filtration strategies on imputation.

Sci Rep. 2021 Mar 18;11(1):6214. doi: 10.1038/s41598-021-85333-z.

Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms.

Genetics. 2021 Apr 15;217(4). doi: 10.1093/genetics/iyab011.

Testing and controlling for horizontal pleiotropy with probabilistic Mendelian randomization in transcriptome-wide association studies.

Nat Commun. 2020 Jul 31;11(1):3861. doi: 10.1038/s41467-020-17668-6.

8q24 genetic variation and comprehensive haplotypes altering familial risk of prostate cancer.

Nat Commun. 2020 Mar 23;11(1):1523. doi: 10.1038/s41467-020-15122-1.

Genetic Architecture of Gene Expression in European and African Americans: An eQTL Mapping Study in GENOA.

Am J Hum Genet. 2020 Apr 2;106(4):496-512. doi: 10.1016/j.ajhg.2020.03.002. Epub 2020 Mar 26.

Stochastic search and joint fine-mapping increases accuracy and identifies previously unreported associations in immune-mediated diseases.

Nat Commun. 2019 Jul 19;10(1):3216. doi: 10.1038/s41467-019-11271-0.

Gene hunting with hidden Markov model knockoffs.

Biometrika. 2019 Mar;106(1):1-18. doi: 10.1093/biomet/asy033. Epub 2018 Aug 4.

References

Bayes factors for genome-wide association studies: comparison with P-values.

Genet Epidemiol. 2009 Jan;33(1):79-86. doi: 10.1002/gepi.20359.

Polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1 alpha are associated with C-reactive protein.

Am J Hum Genet. 2008 May;82(5):1193-201. doi: 10.1016/j.ajhg.2008.03.017. Epub 2008 Apr 24.

Simple and efficient analysis of disease association with missing genotype data.

Am J Hum Genet. 2008 Feb;82(2):444-52. doi: 10.1016/j.ajhg.2007.11.004.

A second generation human haplotype map of over 3.1 million SNPs.

Nature. 2007 Oct 18;449(7164):851-61. doi: 10.1038/nature06258.

Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.

Am J Hum Genet. 2007 Nov;81(5):1084-97. doi: 10.1086/521987. Epub 2007 Sep 21.

Imputation-based analysis of association studies: candidate regions and quantitative traits.

PLoS Genet. 2007 Jul;3(7):e114. doi: 10.1371/journal.pgen.0030114. Epub 2007 May 30.

A Bayesian measure of the probability of false discovery in genetic epidemiology studies.

Am J Hum Genet. 2007 Aug;81(2):208-27. doi: 10.1086/519024. Epub 2007 Jul 3.

A new multipoint method for genome-wide association studies by imputation of genotypes.

Nat Genet. 2007 Jul;39(7):906-13. doi: 10.1038/ng2088. Epub 2007 Jun 17.

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls.

Nature. 2007 Jun 7;447(7145):661-78. doi: 10.1038/nature05911.

A method to address differential bias in genotyping in large-scale association studies.

PLoS Genet. 2007 May 18;3(5):e74. doi: 10.1371/journal.pgen.0030074. Epub 2007 Apr 5.
