The power of gene-based rare variant methods to detect disease-associated variation and test hypotheses about complex disease.

Author information

Moutsianas Loukas, Agarwala Vineeta, Fuchsberger Christian, Flannick Jason, Rivas Manuel A, Gaulton Kyle J, Albers Patrick K, McVean Gil, Boehnke Michael, Altshuler David, McCarthy Mark I

Affiliations

Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.

Program in Biophysics, Harvard University, Cambridge, Massachusetts, United States of America; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America.

Publication information

PLoS Genet. 2015 Apr 23;11(4):e1005165. doi: 10.1371/journal.pgen.1005165. eCollection 2015 Apr.

Abstract

Genome and exome sequencing in large cohorts enables characterization of the role of rare variation in complex diseases. Success in this endeavor, however, requires investigators to test a diverse array of genetic hypotheses which differ in the number, frequency and effect sizes of underlying causal variants. In this study, we evaluated the power of gene-based association methods to interrogate such hypotheses, and examined the implications for study design. We developed a flexible simulation approach, using 1000 Genomes data, to (a) generate sequence variation at human genes in up to 10K case-control samples, and (b) quantify the statistical power of a panel of widely used gene-based association tests under a variety of allelic architectures, locus effect sizes, and significance thresholds. For loci explaining 1% of phenotypic variance underlying a common dichotomous trait, we find that all methods have low absolute power to achieve exome-wide significance (5-20% power at α = 2.5 × 10⁻⁶) in 3K individuals; even in 10K samples, power is modest (~60%). The combined application of multiple methods increases sensitivity, but does so at the expense of a higher false positive rate. MiST, SKAT-O, and KBAC have the highest individual mean power across simulated datasets, but we observe wide architecture-dependent variability in the individual loci detected by each test, suggesting that inferences about disease architecture from analysis of sequencing studies can differ depending on which methods are used. Our results imply that tens of thousands of individuals, extensive functional annotation, or highly targeted hypothesis testing will be required to confidently detect or exclude rare variant signals at complex disease loci.
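
The general idea of estimating power by simulation can be illustrated with a minimal sketch. It is not the authors' pipeline: the study simulated sequence variation from 1000 Genomes data and evaluated tests such as MiST, SKAT-O, and KBAC, whereas the sketch below assumes a much simpler setting in which each individual is reduced to a carrier or non-carrier of any rare allele in the gene, and the "gene-based test" is a plain two-proportion z-test on carrier counts. The function names, carrier frequencies, and replicate counts are hypothetical; only the significance threshold α = 2.5 × 10⁻⁶ is taken from the abstract.

```python
import math
import random

# Exome-wide significance threshold quoted in the abstract.
ALPHA = 2.5e-6


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value of a pooled two-proportion z-test.

    x1 of n1 cases and x2 of n2 controls carry a rare allele.
    This stands in for a collapsing (burden-style) test; it is an
    illustrative assumption, not one of the tests named in the abstract.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no carriers (or all carriers): no evidence either way
    z = (x1 / n1 - x2 / n2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def binomial_draw(rng, n, p):
    """Number of carriers among n individuals, each a carrier with probability p."""
    return sum(rng.random() < p for _ in range(n))


def estimate_power(n_cases, n_controls, freq_cases, freq_controls,
                   alpha=ALPHA, n_replicates=200, seed=0):
    """Fraction of simulated studies in which the test reaches p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_replicates):
        x_cases = binomial_draw(rng, n_cases, freq_cases)
        x_controls = binomial_draw(rng, n_controls, freq_controls)
        p = two_proportion_p_value(x_cases, n_cases, x_controls, n_controls)
        if p < alpha:
            hits += 1
    return hits / n_replicates


if __name__ == "__main__":
    # Hypothetical carrier frequencies: 3% in cases, 2% in controls.
    for n in (1500, 5000):
        power = estimate_power(n, n, 0.03, 0.02)
        print(f"{n} cases + {n} controls: estimated power = {power:.2f}")
```

Running the script compares a 3K-sample and a 10K-sample design under the same hypothetical effect. The printed values depend entirely on the assumed carrier frequencies and will not reproduce the power figures reported in the abstract; the sketch only shows the mechanics of counting how often a test crosses a fixed threshold across simulated replicates.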

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a13a/4407972/18943ff77392/pgen.1005165.g001.jpg
