Department of Mathematics, University of Wisconsin-Madison, 480 Lincoln Drive, Madison, WI, 53706, USA.
Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, Madison, WI, 53726, USA.
Gigascience. 2020 Oct 9;9(10). doi: 10.1093/gigascience/giaa091.
Gene-set analyses measure the association between a disease of interest and a "set" of genes related to a biological pathway. These analyses often incorporate gene network properties to account for differential contributions of each gene. We extend this concept further, defining gene contributions based on biophysical properties, by leveraging mathematical models of biology to predict the effects of genetic perturbations on a particular downstream function.
We present a method that combines gene weights from model predictions and gene ranks from genome-wide association studies into a weighted gene-set test. We demonstrate in simulation how such a method can improve statistical power. Applying this method, we identify a gene set, weighted by model-predicted contributions to intracellular calcium ion concentration, that is significantly related to bipolar disorder in a small dataset (P = 0.04; n = 544). We reproduce this finding using publicly available summary data from the Psychiatric Genomics Consortium (P = 1.7 × 10⁻⁴; n = 41,653). By contrast, an approach using a general calcium signaling pathway did not detect a significant association with bipolar disorder (P = 0.08). The weighted gene-set approach based on intracellular calcium ion concentration did not detect a significant relationship with schizophrenia (P = 0.09; n = 65,967) or major depressive disorder (P = 0.30; n = 500,199).
Together, these findings show how incorporating math biology into gene-set analyses might help to identify biological functions that underlie certain polygenic disorders.
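To make the idea of a weighted gene-set test concrete, the following is a minimal sketch and not the statistic used in the study. It assumes a simple weighted mean of rank-based scores as the test statistic and a null distribution built from randomly drawn gene sets of the same size. The function name `weighted_gene_set_test`, the input dictionaries, and the gene names in the usage note are all hypothetical.

```python
import numpy as np


def weighted_gene_set_test(gene_pvalues, set_weights, n_perm=10000, seed=0):
    """Permutation p-value for a weighted gene-set statistic (illustrative only).

    gene_pvalues: dict mapping every tested gene to its GWAS gene-level p-value.
    set_weights:  dict mapping genes in the set to non-negative weights
                  (for example, model-predicted effect sizes).
    """
    rng = np.random.default_rng(seed)
    genes = list(gene_pvalues)
    pvals = np.array([gene_pvalues[g] for g in genes], dtype=float)

    # Rank 1 = smallest p-value (ties are broken arbitrarily by argsort).
    ranks = pvals.argsort().argsort() + 1
    # Map ranks to scores in (0, 1]; a higher score means a stronger association.
    scores = 1.0 - (ranks - 1) / len(genes)

    # Keep only set members that were actually tested in the GWAS.
    index = {g: i for i, g in enumerate(genes)}
    members = [g for g in set_weights if g in index]
    idx = np.array([index[g] for g in members])
    w = np.array([set_weights[g] for g in members], dtype=float)
    w = w / w.sum()  # normalize so the statistic is a weighted mean

    observed = float(w @ scores[idx])

    # Assumed null: the same weights applied to randomly chosen genes.
    null = np.empty(n_perm)
    for b in range(n_perm):
        rand_idx = rng.choice(len(genes), size=len(idx), replace=False)
        null[b] = w @ scores[rand_idx]

    # One-sided p-value with the usual +1 correction for permutation tests.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, float(p_value)
```

A call such as `weighted_gene_set_test(pvals_by_gene, {"GENE_A": 0.8, "GENE_B": 0.2})` returns the observed statistic and its permutation p-value. Setting all weights equal reduces this sketch to an unweighted rank-based gene-set test, which is one way to see what the weights contribute.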