Analysis of untyped SNPs: maximum likelihood and imputation methods.

Affiliation

Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599-7420, USA.

Publication information

Genet Epidemiol. 2010 Dec;34(8):803-15. doi: 10.1002/gepi.20527.

DOI:10.1002/gepi.20527

PMID:21104886

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3030127/

Abstract

Analysis of untyped single nucleotide polymorphisms (SNPs) can facilitate the localization of disease-causing variants and permit meta-analysis of association studies with different genotyping platforms. We present two approaches for using the linkage disequilibrium structure of an external reference panel to infer the unknown value of an untyped SNP from the observed genotypes of typed SNPs. The maximum-likelihood approach integrates the prediction of untyped genotypes and estimation of association parameters into a single framework and yields consistent and efficient estimators of genetic effects and gene-environment interactions with proper variance estimators. The imputation approach is a two-stage strategy, which first imputes the untyped genotypes by either the most likely genotypes or the expected genotype counts and then uses the imputed values in a downstream association analysis. The latter approach has proper control of type I error in single-SNP tests with possible covariate adjustments even when the reference panel is misspecified; however, type I error may not be properly controlled in testing multiple-SNP effects or gene-environment interactions. In general, imputation yields biased estimators of genetic effects and gene-environment interactions, and the variances are underestimated. We conduct extensive simulation studies to compare the bias, type I error, power, and confidence interval coverage between the maximum likelihood and imputation approaches in the analysis of single-SNP effects, multiple-SNP effects, and gene-environment interactions under cross-sectional and case-control designs. In addition, we provide an illustration with genome-wide data from the Wellcome Trust Case-Control Consortium (WTCCC) [2007].
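The two imputation variants named in the abstract (most likely genotype versus expected genotype count) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the posterior genotype probabilities, the trait values, and the helper names `expected_dosage`, `most_likely_genotype`, and `ols_slope` are all hypothetical. In practice the probabilities would be derived from the linkage disequilibrium structure of a reference panel.

```python
from typing import List, Sequence


def expected_dosage(probs: Sequence[float]) -> float:
    """Expected count of the coded allele given (P(G=0), P(G=1), P(G=2))."""
    _, p1, p2 = probs
    return p1 + 2.0 * p2


def most_likely_genotype(probs: Sequence[float]) -> int:
    """Genotype (0, 1, or 2) with the highest posterior probability."""
    return max(range(3), key=lambda g: probs[g])


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope of y on x: the naive second-stage effect estimate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical posterior probabilities for an untyped SNP in four subjects.
posteriors = [
    (0.90, 0.09, 0.01),
    (0.20, 0.60, 0.20),
    (0.05, 0.25, 0.70),
    (0.50, 0.40, 0.10),
]
# Hypothetical quantitative trait values for the same subjects.
trait = [1.0, 2.1, 2.9, 1.4]

# Stage 1: impute the untyped genotype in two ways.
dosages: List[float] = [expected_dosage(p) for p in posteriors]
best_guess: List[int] = [most_likely_genotype(p) for p in posteriors]

# Stage 2: treat the imputed values as if they were observed genotypes.
slope_dosage = ols_slope(dosages, trait)
slope_best_guess = ols_slope(best_guess, trait)
```

The second stage here treats imputed values as observed, which is the step the abstract identifies as a source of biased effect estimates and underestimated variances. The maximum-likelihood approach, which estimates genotype prediction and association parameters jointly, is not shown.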

Similar articles

Analysis of untyped SNPs: maximum likelihood and imputation methods.

Genet Epidemiol. 2010 Dec;34(8):803-15. doi: 10.1002/gepi.20527.

ATRIUM: testing untyped SNPs in case-control association studies with related individuals.

Am J Hum Genet. 2009 Nov;85(5):667-78. doi: 10.1016/j.ajhg.2009.10.006.

Fast and robust association tests for untyped SNPs in case-control studies.

Hum Hered. 2010;70(3):167-76. doi: 10.1159/000308456. Epub 2010 Jul 30.

Accuracy of genome-wide imputation of untyped markers and impacts on statistical power for association studies.

BMC Genet. 2009 Jun 16;10:27. doi: 10.1186/1471-2156-10-27.

Using imputed genotype data in the joint score tests for genetic association and gene-environment interactions in case-control studies.

Genet Epidemiol. 2018 Mar;42(2):146-155. doi: 10.1002/gepi.22093. Epub 2017 Nov 26.

Jackknife-based gene-gene interaction tests for untyped SNPs.

BMC Genet. 2015 Jul 18;16:85. doi: 10.1186/s12863-015-0225-9.

Genome-wide association of breast cancer: composite likelihood with imputed genotypes.

Eur J Hum Genet. 2011 Feb;19(2):194-9. doi: 10.1038/ejhg.2010.157. Epub 2010 Oct 20.

1000 Genomes-based imputation identifies novel and refined associations for the Wellcome Trust Case Control Consortium phase 1 Data.

Eur J Hum Genet. 2012 Jul;20(7):801-5. doi: 10.1038/ejhg.2012.3. Epub 2012 Feb 1.

Simultaneous analysis of all SNPs in genome-wide and re-sequencing association studies.

PLoS Genet. 2008 Jul 25;4(7):e1000130. doi: 10.1371/journal.pgen.1000130.

Increasing power of genome-wide association studies by collecting additional single-nucleotide polymorphisms.

Genetics. 2011 Jun;188(2):449-60. doi: 10.1534/genetics.111.128595. Epub 2011 Apr 5.

Cited by

Genetic signature of differentiated thyroid carcinoma susceptibility: a machine learning approach.

Eur Thyroid J. 2022 Sep 12;11(5). doi: 10.1530/ETJ-22-0058. Print 2022 Oct 1.

A new approach to testing mediation of the microbiome at both the community and individual taxon levels.

Bioinformatics. 2022 Jun 13;38(12):3173-3180. doi: 10.1093/bioinformatics/btac310.

A likelihood-based approach to transcriptome association analysis.

Stat Med. 2019 Apr 15;38(8):1357-1373. doi: 10.1002/sim.8040. Epub 2018 Dec 4.

Comparison of three boosting methods in parent-offspring trios for genotype imputation using simulation study.

J Anim Sci Technol. 2016 Jan 6;58:1. doi: 10.1186/s40781-015-0081-1. eCollection 2016.

A Likelihood-Based Framework for Association Analysis of Allele-Specific Copy Numbers.

J Am Stat Assoc. 2014 Oct;109(508):1533-1545. doi: 10.1080/01621459.2014.908777.

Integrative analysis of sequencing and array genotype data for discovering disease associations with rare mutations.

Proc Natl Acad Sci U S A. 2015 Jan 27;112(4):1019-24. doi: 10.1073/pnas.1406143112. Epub 2015 Jan 12.

Optimal methods for using posterior probabilities in association testing.

Hum Hered. 2013;75(1):2-11. doi: 10.1159/000349974. Epub 2013 Mar 27.

A genome-wide scan for breast cancer risk haplotypes among African American women.

PLoS One. 2013;8(2):e57298. doi: 10.1371/journal.pone.0057298. Epub 2013 Feb 28.

A mega-analysis of genome-wide association studies for major depressive disorder.

Mol Psychiatry. 2013 Apr;18(4):497-511. doi: 10.1038/mp.2012.21. Epub 2012 Apr 3.

References

Simple and efficient analysis of disease association with missing genotype data.

Am J Hum Genet. 2008 Feb;82(2):444-52. doi: 10.1016/j.ajhg.2007.11.004.

Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.

Am J Hum Genet. 2007 Nov;81(5):1084-97. doi: 10.1086/521987. Epub 2007 Sep 21.

New models of collaboration in genome-wide association studies: the Genetic Association Information Network.

Nat Genet. 2007 Sep;39(9):1045-51. doi: 10.1038/ng2127.

A new multipoint method for genome-wide association studies by imputation of genotypes.

Nat Genet. 2007 Jul;39(7):906-13. doi: 10.1038/ng2088. Epub 2007 Jun 17.

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls.

Nature. 2007 Jun 7;447(7145):661-78. doi: 10.1038/nature05911.

Leveraging the HapMap correlation structure in association studies.

Am J Hum Genet. 2007 Apr;80(4):683-91. doi: 10.1086/513109. Epub 2007 Mar 2.

Testing untyped alleles (TUNA)-applications to genome-wide association studies.

Genet Epidemiol. 2006 Dec;30(8):718-27. doi: 10.1002/gepi.20182.

A haplotype map of the human genome.

Nature. 2005 Oct 27;437(7063):1299-320. doi: 10.1038/nature04226.

Efficiency and power in genetic association studies.

Nat Genet. 2005 Nov;37(11):1217-23. doi: 10.1038/ng1669. Epub 2005 Oct 23.

Asymptotic equivalence between two score tests for haplotype-specific risk in general linear models.

Genet Epidemiol. 2005 Sep;29(2):166-70. doi: 10.1002/gepi.20087.
