IEEE/ACM Trans Comput Biol Bioinform. 2020 Sep-Oct;17(5):1613-1624. doi: 10.1109/TCBB.2019.2907246. Epub 2019 Mar 25.
Pathway enrichment analysis models (PEM) are the premier methods for interpreting gene expression profiles from high-throughput experiments. PEM often use a priori background knowledge to infer the underlying biological functions and mechanisms. A shortcoming of standard PEM is that they disregard gene interactions for simplicity, which can result in partial and inaccurate inference. In this study, we introduce a graph-based PEM, namely Causal Disturbance Analysis (CADIA), that leverages gene interactions to quantify the topological importance of genes' expression profiles in a pathway's organization. In particular, CADIA uses a novel graph centrality model, namely Source/Sink, to measure the topological importance. Source/Sink Centrality quantifies a gene's importance as a receiver and a sender of biological information, which allows for prioritizing the genes that are more likely to disturb a pathway's functionality. CADIA infers an enrichment score for a pathway by deriving statistical evidence from the Source/Sink centrality of the differentially expressed genes and combining it with classical over-representation analysis. Through evaluations on real-world experimental data and synthetic data, we show that CADIA can uniquely infer critical pathway enrichments that are not observable through other PEM. Our results indicate that CADIA is sensitive to topologically central gene-level changes and provides an informative framework for interpreting high-throughput data.
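The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the published CADIA implementation. It assumes a Katz-style walk count as a stand-in for Source/Sink centrality (walks leaving a gene for the "sender" role, walks arriving at it for the "receiver" role), a permutation test for the topological evidence, a hypergeometric test for over-representation, and Fisher's method for combining the two p-values. All function names, the `beta` damping factor, and the toy pathway are hypothetical.

```python
import math
import random

def walk_score(edges, nodes, beta=0.1, max_len=10):
    """Katz-style score: beta**k-weighted count of walks of length 1..max_len
    that START at each node (an assumed proxy for the 'sender' role)."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    walks = {n: 1.0 for n in nodes}   # walks of length 0
    score = {n: 0.0 for n in nodes}
    for k in range(1, max_len + 1):
        # a walk of length k from n is an edge n->m plus a walk of length k-1 from m
        walks = {n: sum(walks[m] for m in succ[n]) for n in nodes}
        for n in nodes:
            score[n] += (beta ** k) * walks[n]
    return score

def source_sink(edges, nodes, beta=0.1):
    """Assumed Source/Sink proxy: sender score plus receiver score,
    where the receiver score is the sender score on the reversed graph."""
    source = walk_score(edges, nodes, beta)
    sink = walk_score([(v, u) for u, v in edges], nodes, beta)
    return {n: source[n] + sink[n] for n in nodes}

def topology_pvalue(score, pathway_nodes, de_genes, n_perm=1000, seed=0):
    """Permutation p-value: is the summed centrality of the differentially
    expressed (DE) genes larger than that of random gene sets of equal size?"""
    hits = [g for g in pathway_nodes if g in de_genes]
    if not hits:
        return 1.0
    observed = sum(score[g] for g in hits)
    rng = random.Random(seed)
    count = 0
    for _ in range(n_perm):
        sample = rng.sample(pathway_nodes, len(hits))
        if sum(score[g] for g in sample) >= observed - 1e-12:
            count += 1
    return (1 + count) / (1 + n_perm)

def ora_pvalue(universe_size, pathway_size, n_de, overlap):
    """Classical over-representation: hypergeometric upper tail P(X >= overlap)."""
    total = math.comb(universe_size, n_de)
    tail = sum(
        math.comb(pathway_size, i) * math.comb(universe_size - pathway_size, n_de - i)
        for i in range(overlap, min(pathway_size, n_de) + 1)
    )
    return tail / total

def fisher_combine(p1, p2):
    """Fisher's method for two p-values (chi-square with 4 degrees of freedom),
    which has the closed form x * (1 - ln x) with x = p1 * p2."""
    x = p1 * p2
    return x * (1.0 - math.log(x))

# Toy pathway (hypothetical): a small signalling cascade with one hub gene "c".
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("c", "e")]
centrality = source_sink(edges, nodes, beta=0.5)
p_topology = topology_pvalue(centrality, nodes, de_genes={"c"})
p_ora = ora_pvalue(universe_size=100, pathway_size=5, n_de=10, overlap=1)
print(fisher_combine(p_topology, p_ora))
```

In this toy pathway the hub gene `c` both receives from and sends to other genes, so it gets the highest combined score, and a differential-expression hit on `c` yields a smaller topology p-value than a hit on a peripheral gene would.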