Extending rare-variant testing strategies: analysis of noncoding sequence and imputed genotypes.

Affiliation

Department of Biostatistics, University of Michigan, Ann Arbor, 48109, USA.

Publication information

Am J Hum Genet. 2010 Nov 12;87(5):604-17. doi: 10.1016/j.ajhg.2010.10.012.

DOI:10.1016/j.ajhg.2010.10.012

PMID:21070896

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2978957/

Abstract

Next Generation Sequencing Technology has revolutionized our ability to study the contribution of rare genetic variation to heritable traits. However, existing single-marker association tests are underpowered for detecting rare risk variants. A more powerful approach involves pooling methods that combine multiple rare variants from the same gene into a single test statistic. Proposed pooling methods can be limited because they generally assume high-quality genotypes derived from deep-coverage sequencing, which may not be available. In this paper, we consider an intuitive and computationally efficient pooling statistic, the cumulative minor-allele test (CMAT). We assess the performance of the CMAT and other pooling methods on datasets simulated with population genetic models to contain realistic levels of neutral variation. We consider study designs ranging from exon-only to whole-gene analyses that contain noncoding variants. For all study designs, the CMAT achieves power comparable to that of previously proposed methods. We then extend the CMAT to probabilistic genotypes and describe application to low-coverage sequencing and imputation data. We show that augmenting sequence data with imputed samples is a practical method for increasing the power of rare-variant studies. We also provide a method of controlling for confounding variables such as population stratification. Finally, we demonstrate that our method makes it possible to use external imputation templates to analyze rare variants imputed into existing GWAS datasets. As proof of principle, we performed a CMAT analysis of more than 8 million SNPs that we imputed into the GAIN psoriasis dataset by using haplotypes from the 1000 Genomes Project.
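The abstract does not give the CMAT formula, so the following is only a minimal sketch of the general idea it describes: pool minor-allele counts across the rare variants in a gene, compare cases with controls, and allow fractional dosages in place of hard genotype calls. The 2x2 Pearson chi-square form, the permutation p-value, and the names `pooled_minor_allele_stat` and `permutation_pvalue` are assumptions made for illustration, not the authors' definition or implementation.

```python
import numpy as np


def pooled_minor_allele_stat(dosages, is_case):
    """Chi-square-style statistic on pooled minor-allele counts.

    dosages: (n_samples, n_sites) array of minor-allele counts in [0, 2].
             Fractional values stand in for expected counts derived from
             probabilistic (imputed or low-coverage) genotypes.
    is_case: boolean array of length n_samples.
    NOTE: an illustrative 2x2 Pearson statistic, not the paper's exact formula.
    """
    dosages = np.asarray(dosages, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    n_sites = dosages.shape[1]

    # Pooled minor (m) and major (M) allele counts in cases and controls.
    m_case = dosages[is_case].sum()
    m_ctrl = dosages[~is_case].sum()
    M_case = 2.0 * is_case.sum() * n_sites - m_case
    M_ctrl = 2.0 * (~is_case).sum() * n_sites - m_ctrl

    total = m_case + m_ctrl + M_case + M_ctrl
    denom = ((m_case + m_ctrl) * (M_case + M_ctrl)
             * (m_case + M_case) * (m_ctrl + M_ctrl))
    if denom == 0:
        return 0.0  # no minor alleles, or one group is empty
    return total * (m_case * M_ctrl - m_ctrl * M_case) ** 2 / denom


def permutation_pvalue(dosages, is_case, n_perm=1000, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = np.random.default_rng(seed)
    is_case = np.asarray(is_case, dtype=bool)
    observed = pooled_minor_allele_stat(dosages, is_case)
    hits = sum(
        pooled_minor_allele_stat(dosages, rng.permutation(is_case)) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because pooled allele counts within a gene are not independent observations, the sketch uses label permutation rather than a chi-square reference distribution to assess significance.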
