Department of Biochemistry and Biophysics, University of California at San Francisco, San Francisco, CA 94143, USA.
Am J Hum Genet. 2013 May 2;92(5):667-80. doi: 10.1016/j.ajhg.2013.03.022.
Genetic mapping of complex diseases to date depends on variations inside or close to the genes that perturb their activities. A strong body of evidence suggests that changes in gene expression play a key role in complex diseases and that numerous loci perturb gene expression in trans. The information in trans variants, however, has largely been ignored in the current analysis paradigm. Here we present a statistical framework for genetic mapping by utilizing collective information in both cis and trans variants. We reason that for a disease-associated gene, any genetic variation that perturbs its expression is also likely to influence the disease risk. Thus, the expression quantitative trait loci (eQTL) of the gene, which constitute a unique "genetic signature," should overlap significantly with the set of loci associated with the disease. We translate this idea into a computational algorithm (named Sherlock) to search for gene-disease associations from GWASs, taking advantage of independent eQTL data. Application of this strategy to Crohn disease and type 2 diabetes predicts a number of genes with possible disease roles, including several predictions supported by solid experimental evidence. Importantly, predicted genes are often implicated by multiple trans eQTL with moderate associations. These genes are far from any GWAS association signals and thus cannot be identified from the GWAS alone. Our approach allows analysis of association data from a new perspective and is applicable to any complex phenotype. It is readily generalizable to molecular traits other than gene expression, such as metabolites, noncoding RNAs, and epigenetic modifications.
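The central idea, that a disease gene's eQTL "genetic signature" should overlap the disease-associated loci more than chance would predict, can be illustrated with a simple enrichment test. The sketch below is not the Sherlock statistic, which the abstract does not specify. It is a plain hypergeometric overlap test, and the function name `overlap_pvalue`, the `gwas_threshold` cutoff, and the SNP identifiers are hypothetical choices made for illustration. It also assumes SNPs are independent, which linkage disequilibrium violates in real data.

```python
from math import comb


def overlap_pvalue(eqtl_snps, gwas_pvalues, gwas_threshold=1e-3):
    """Test whether one gene's eQTL SNPs overlap GWAS-associated SNPs.

    eqtl_snps      -- iterable of SNP ids that are eQTL (cis or trans) of the gene
    gwas_pvalues   -- dict mapping SNP id -> GWAS association p-value
    gwas_threshold -- hypothetical cutoff for calling a SNP "disease-associated"

    Returns (k, p): the number of overlapping SNPs and the hypergeometric
    probability of seeing at least k overlaps by chance.
    """
    universe = set(gwas_pvalues)            # SNPs tested in the GWAS
    eqtl = set(eqtl_snps) & universe        # keep only eQTL SNPs seen in the GWAS
    hits = {s for s, p in gwas_pvalues.items() if p <= gwas_threshold}

    N = len(universe)   # total SNPs
    K = len(hits)       # disease-associated SNPs
    n = len(eqtl)       # eQTL SNPs of the gene
    k = len(eqtl & hits)  # SNPs that are both

    # P(X >= k) for X ~ Hypergeometric(N, K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)


if __name__ == "__main__":
    # Toy data: 10 SNPs, three of which pass the GWAS cutoff.
    gwas = {f"rs{i}": 0.5 for i in range(1, 11)}
    gwas.update({"rs1": 1e-5, "rs2": 1e-4, "rs3": 5e-4})
    print(overlap_pvalue(["rs1", "rs2", "rs9"], gwas))
```

Because the test counts every eQTL SNP equally, several moderately associated trans eQTL can jointly implicate a gene even when none of them would stand out in the GWAS alone, which is the situation the abstract highlights.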