Association studies for next-generation sequencing.

Affiliation

Human Genetics Center, University of Texas School of Public Health, Houston, TX 77030, USA.

Publication information

Genome Res. 2011 Jul;21(7):1099-108. doi: 10.1101/gr.115998.110. Epub 2011 Apr 26.

Abstract

Genome-wide association studies (GWAS) have become the primary approach for identifying genes with common variants influencing complex diseases. Despite considerable progress, the common variations identified by GWAS account for only a small fraction of disease heritability and are unlikely to explain the majority of phenotypic variations of common diseases. A potential source of the missing heritability is the contribution of rare variants. Next-generation sequencing technologies will detect millions of novel rare variants, but these technologies have three defining features: identification of a large number of rare variants, a high proportion of sequence errors, and a large proportion of missing data. These features raise challenges for testing the association of rare variants with phenotypes of interest. In this study, we use a genome continuum model and functional principal components as a general principle for developing novel and powerful association analysis methods designed for resequencing data. We use simulations to calculate the type I error rates and the power of nine alternative statistics: two functional principal component analysis (FPCA)-based statistics, the multivariate principal component analysis (MPCA)-based statistic, the weighted sum (WSS), the variable-threshold (VT) method, the generalized T², the collapsing method, the CMC method, and individual tests. We also examined the impact of sequence errors on their type I error rates. Finally, we apply the nine statistics to the published resequencing data set from ANGPTL4 in the Dallas Heart Study. We report that FPCA-based statistics have a higher power to detect association of rare variants and a stronger ability to filter sequence errors than the other seven methods.

Similar articles

1
Association studies for next-generation sequencing.
Genome Res. 2011 Jul;21(7):1099-108. doi: 10.1101/gr.115998.110. Epub 2011 Apr 26.
3
Family-based association studies for next-generation sequencing.
Am J Hum Genet. 2012 Jun 8;90(6):1028-45. doi: 10.1016/j.ajhg.2012.04.022.

Cited by

1
An overview of recent technological developments in bovine genomics.
Vet Anim Sci. 2024 Jul 23;25:100382. doi: 10.1016/j.vas.2024.100382. eCollection 2024 Sep.
5
Protein Sequencing, One Molecule at a Time.
Annu Rev Biophys. 2022 May 9;51:181-200. doi: 10.1146/annurev-biophys-102121-103615. Epub 2022 Jan 5.

References

3
Pooled association tests for rare variants in exon-resequencing studies.
Am J Hum Genet. 2010 Jun 11;86(6):832-8. doi: 10.1016/j.ajhg.2010.04.005. Epub 2010 May 13.
5
Rare variants create synthetic genome-wide associations.
PLoS Biol. 2010 Jan 26;8(1):e1000294. doi: 10.1371/journal.pbio.1000294.
6
Population genetic inference from genomic sequence variation.
Genome Res. 2010 Mar;20(3):291-300. doi: 10.1101/gr.079509.108. Epub 2010 Jan 12.
9
Common vs. rare allele hypotheses for complex diseases.
Curr Opin Genet Dev. 2009 Jun;19(3):212-9. doi: 10.1016/j.gde.2009.04.010. Epub 2009 May 28.
