Increasing power of genome-wide association studies by collecting additional single-nucleotide polymorphisms.

Author information

Department of Computer Science, University of California, Los Angeles, California 90095-1596, USA.

Publication information

Genetics. 2011 Jun;188(2):449-60. doi: 10.1534/genetics.111.128595. Epub 2011 Apr 5.

DOI:10.1534/genetics.111.128595

PMID:21467568

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3122306/

Abstract

Genome-wide association studies (GWASs) have been effectively identifying the genomic regions associated with a disease trait. In a typical GWAS, an informative subset of the single-nucleotide polymorphisms (SNPs), called tag SNPs, is genotyped in case/control individuals. Once the tag SNP statistics are computed, the genomic regions that are in linkage disequilibrium (LD) with the most significantly associated tag SNPs are believed to contain the causal polymorphisms. However, such LD regions are often large and contain many additional polymorphisms. Following up all the SNPs included in these regions is costly and infeasible for biological validation. In this article we address how to characterize these regions cost effectively with the goal of providing investigators a clear direction for biological validation. We introduce a follow-up study approach for identifying all untyped associated SNPs by selecting additional SNPs, called follow-up SNPs, from the associated regions and genotyping them in the original case/control individuals. We introduce a novel SNP selection method with the goal of maximizing the number of associated SNPs among the chosen follow-up SNPs. We show how the observed statistics of the original tag SNPs and human genetic variation reference data such as the HapMap Project can be utilized to identify the follow-up SNPs. We use simulated and real association studies based on the HapMap data and the Wellcome Trust Case Control Consortium to demonstrate that our method shows superior performance to the correlation- and distance-based traditional follow-up SNP selection approaches. Our method is publicly available at http://genetics.cs.ucla.edu/followupSNPs.
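The idea of using observed tag SNP statistics together with reference LD data to prioritize follow-up SNPs can be illustrated with a minimal sketch. This is not the authors' published algorithm. It assumes that association z-scores are jointly normal with covariance equal to the LD correlation matrix estimated from a reference panel, and it ranks untyped SNPs by the conditional probability that their statistic would exceed a significance threshold. The function name `rank_followup_candidates`, its parameters, the threshold value, and the small ridge term are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def rank_followup_candidates(z_tag, r_tt, r_ut, threshold=5.0, ridge=1e-6):
    """Rank untyped SNPs by P(|z| > threshold) given observed tag z-scores.

    Assumption (not from the source): z-scores are multivariate normal with
    covariance equal to the reference-panel LD correlation matrix.

    z_tag : (t,)   observed z-scores at the tag SNPs
    r_tt  : (t, t) correlation among tag SNPs
    r_ut  : (u, t) correlation between untyped SNPs and tag SNPs
    """
    z_tag = np.asarray(z_tag, dtype=float)
    r_ut = np.asarray(r_ut, dtype=float)
    # A small ridge keeps the tag correlation matrix invertible.
    r_tt = np.asarray(r_tt, dtype=float) + ridge * np.eye(z_tag.shape[0])

    # weights = r_ut @ inv(r_tt), computed with a linear solve (r_tt is symmetric).
    weights = np.linalg.solve(r_tt, r_ut.T).T

    # Conditional mean and variance of each untyped z-score.
    mean = weights @ z_tag
    var = np.clip(1.0 - np.sum(weights * r_ut, axis=1), 1e-12, None)
    sd = np.sqrt(var)

    def upper_tail(x):
        # Standard normal survival function via the complementary error function.
        return 0.5 * math.erfc(x / math.sqrt(2.0))

    # Two-sided probability that the untyped statistic exceeds the threshold.
    prob = np.array([
        upper_tail((threshold - m) / s) + upper_tail((threshold + m) / s)
        for m, s in zip(mean, sd)
    ])

    order = np.argsort(-prob)  # most promising candidates first
    return order, mean, prob


if __name__ == "__main__":
    # One tag SNP with a strong signal; two untyped SNPs in high and low LD with it.
    order, mean, prob = rank_followup_candidates(
        z_tag=[6.0],
        r_tt=[[1.0]],
        r_ut=[[0.9], [0.1]],
    )
    print("ranking:", order)
    print("conditional means:", mean)
    print("exceedance probabilities:", prob)
```

In this toy input the untyped SNP in strong LD with the significant tag is ranked first, which matches the intuition behind correlation-based selection. The probabilistic ranking differs from a pure correlation ranking once several tags with different statistics are in LD with the same candidate.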
