Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA.
PLoS Comput Biol. 2010 Oct 14;6(10):e1000954. doi: 10.1371/journal.pcbi.1000954.
Genome-wide association (GWA) studies, which test for association between common genetic markers and a disease phenotype, have shown varying degrees of success. While many factors could potentially confound GWA studies, we focus on the possibility that multiple rare variants (RVs) may act in concert to influence disease etiology. Here, we describe an algorithm for RV analysis, RareCover. The algorithm combines a disparate collection of RVs with low effect and modest penetrance. Further, it does not require that the rare variants be adjacent in location. Extensive simulations over a range of assumed penetrance and population attributable risk (PAR) values illustrate the power of our approach over other published methods, including the collapsing and weighted-collapsing strategies. To showcase the method, we apply RareCover to re-sequencing data from a cohort of 289 individuals at the extremes of the Body Mass Index distribution (NCT00263042). Individual samples were re-sequenced at two genes, FAAH and MGLL, known to be involved in endocannabinoid metabolism (187 Kbp for 148 obese and 150 controls). The RareCover analysis identifies exactly one significantly associated region in each gene, each about 5 Kbp long and located in the upstream regulatory regions. The data suggest that the RVs help disrupt the expression of the two genes, leading to lowered metabolism of the corresponding cannabinoids. Overall, our results point to the power of including RVs in measuring genetic associations.
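The abstract says that RareCover combines a collection of RVs that need not be adjacent, and it compares the method against collapsing strategies, but it does not spell out the statistic or the selection rule. The sketch below is therefore not the published RareCover implementation. It is a minimal illustration, under stated assumptions, of one way a greedy union-of-variants search could look: each individual is marked as a carrier if they hold any selected variant, and variants are added one at a time while a Pearson chi-square statistic on the 2x2 carrier-by-phenotype table keeps improving. The function names (`chi_square_2x2`, `collapse`, `greedy_collapse`), the choice of statistic, and the `min_gain` stopping threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple


def chi_square_2x2(carrier: Sequence[int], is_case: Sequence[int]) -> float:
    """Pearson chi-square statistic for a 2x2 carrier-by-phenotype table."""
    a = sum(1 for c, y in zip(carrier, is_case) if c and y)          # carrier, case
    b = sum(1 for c, y in zip(carrier, is_case) if c and not y)      # carrier, control
    c_ = sum(1 for c, y in zip(carrier, is_case) if not c and y)     # non-carrier, case
    d = sum(1 for c, y in zip(carrier, is_case) if not c and not y)  # non-carrier, control
    n = a + b + c_ + d
    denom = (a + b) * (c_ + d) * (a + c_) * (b + d)
    if denom == 0:
        return 0.0  # a margin is empty, so the table carries no signal
    return n * (a * d - b * c_) ** 2 / denom


def collapse(genotypes: Sequence[Sequence[int]], selected: Sequence[int]) -> List[int]:
    """Mark an individual as a carrier if they hold any of the selected variants."""
    return [1 if any(row[j] for j in selected) else 0 for row in genotypes]


def greedy_collapse(
    genotypes: Sequence[Sequence[int]],
    is_case: Sequence[int],
    min_gain: float = 1e-9,  # assumed stopping threshold, not taken from the paper
) -> Tuple[List[int], float]:
    """Greedily add variants while the collapsed chi-square statistic improves."""
    n_variants = len(genotypes[0]) if genotypes else 0
    selected: List[int] = []
    remaining = set(range(n_variants))
    best = 0.0
    while remaining:
        candidate, candidate_score = None, best
        for j in sorted(remaining):
            score = chi_square_2x2(collapse(genotypes, selected + [j]), is_case)
            if score > candidate_score + min_gain:
                candidate, candidate_score = j, score
        if candidate is None:
            break  # no remaining variant improves the statistic
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best


# Toy data (invented for illustration): rows are individuals, columns are variants.
toy_genotypes = [
    [1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 0],  # cases
    [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0],  # controls
]
toy_is_case = [1, 1, 1, 1, 0, 0, 0, 0]
print(greedy_collapse(toy_genotypes, toy_is_case))
```

Because the chi-square statistic is two-sided, this sketch does not distinguish risk-increasing from protective variants, and the maximized statistic is biased upward by the search itself. Assessing significance would need something like a permutation test over phenotype labels, which is omitted here.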