Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America.
PLoS Comput Biol. 2019 Mar 20;15(3):e1006864. doi: 10.1371/journal.pcbi.1006864. eCollection 2019 Mar.
Basal gene expression levels have been shown to be predictive of cellular response to cytotoxic treatments. However, such analyses do not fully reveal complex genotype-phenotype relationships, which are partly encoded in highly interconnected molecular networks. Biological pathways provide a complementary way of understanding drug response variation among individuals. In this study, we integrate chemosensitivity data from a large-scale pharmacogenomics study with basal gene expression data from the CCLE project and prior knowledge of molecular networks to identify specific pathways mediating chemical response. We first develop a computational method called PACER, which ranks pathways for enrichment in a given set of genes using a novel network embedding method. It examines a molecular network that encodes known gene-gene as well as gene-pathway relationships, and determines a vector representation of each gene and pathway in the same low-dimensional vector space. The relevance of a pathway to the given gene set is then captured by the similarity between the pathway vector and gene vectors. To apply this approach to chemosensitivity data, we identify genes whose basal expression levels in a panel of cell lines are correlated with cytotoxic response to a compound, and then rank pathways for relevance to these response-correlated genes using PACER. Extensive evaluation of this approach on benchmarks constructed from databases of compound target genes and large collections of drug response signatures demonstrates its advantages in identifying compound-pathway associations compared to existing statistical methods of pathway enrichment analysis. The associations identified by PACER can serve as testable hypotheses on chemosensitivity pathways and help further study the mechanisms of action of specific cytotoxic drugs. More broadly, PACER represents a novel technique for identifying enriched properties of any gene set of interest while also taking into account networks of known gene-gene relationships and interactions.
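The two steps described in the abstract (selecting response-correlated genes, then scoring pathways by similarity between pathway and gene vectors) can be illustrated with a minimal Python sketch. This is not the PACER implementation. The function names, the use of Pearson correlation, the `top_k` cutoff, and the mean cosine similarity score are all assumptions made for illustration, and the embeddings are taken as given rather than learned from a molecular network.

```python
import numpy as np

def response_correlated_genes(expr, response, top_k):
    """Return indices of the top_k genes by absolute Pearson correlation.

    expr:     (n_cell_lines, n_genes) basal expression matrix
    response: (n_cell_lines,) cytotoxic response to one compound
    Assumption: Pearson correlation and a fixed top_k cutoff.
    """
    expr_c = expr - expr.mean(axis=0)
    resp_c = response - response.mean()
    denom = np.linalg.norm(expr_c, axis=0) * np.linalg.norm(resp_c)
    # Guard against constant genes (zero variance) to avoid division by zero.
    r = (expr_c.T @ resp_c) / np.where(denom == 0, 1.0, denom)
    return np.argsort(-np.abs(r))[:top_k]

def rank_pathways(gene_vecs, pathway_vecs, gene_idx):
    """Rank pathways by mean cosine similarity to the selected gene vectors.

    gene_vecs:    (n_genes, d) gene embeddings (assumed precomputed)
    pathway_vecs: (n_pathways, d) pathway embeddings in the same space
    Assumption: relevance is the mean cosine similarity; the paper's
    actual scoring function may differ.
    """
    g = gene_vecs[gene_idx]
    g = g / np.maximum(np.linalg.norm(g, axis=1, keepdims=True), 1e-12)
    p = pathway_vecs / np.maximum(
        np.linalg.norm(pathway_vecs, axis=1, keepdims=True), 1e-12
    )
    scores = (p @ g.T).mean(axis=1)
    order = np.argsort(-scores)  # best-scoring pathway first
    return order, scores
```

In this sketch, a pathway ranks highly when its vector points in roughly the same direction as the vectors of the response-correlated genes, which mirrors the abstract's description of relevance as vector similarity in a shared low-dimensional space.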