Graduate School of Science and Technology & NAIST Data Science Center, Nara Institute of Science and Technology, Nara, Japan.
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada.
BMC Bioinformatics. 2018 Jul 13;19(1):264. doi: 10.1186/s12859-018-2251-x.
Associations between genes and diseases are varied and complex, and identifying causal associations between genes and specific diseases remains challenging. In this work we present a method to predict novel associations of genes and pathways with inflammatory bowel disease (IBD) by integrating differential gene expression, protein-protein interactions, and known IBD-related disease genes.
We downloaded IBD gene expression data from NCBI's Gene Expression Omnibus and performed statistical analysis to determine differentially expressed genes. We also collected known IBD genes from the DisGeNet database. These gene sets were used to construct an IBD-related protein-protein interaction (PPI) network with the HIPPIE database. We adapted our graph-based clustering algorithm DPClusO to cluster the disease PPI network. Using Fisher's exact test, we evaluated the statistical significance of the identified clusters in terms of their richness in IBD genes, and we predicted novel genes related to IBD. Checked against other databases and the published literature on IBD, 93.8% of our predictions are correct.
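The cluster-significance step can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test for enrichment, which is equivalent to the upper tail of the hypergeometric distribution. The abstract does not state whether the test was one-sided or how the contingency table was built, so the function name `enrichment_p_value`, its parameters, and the example counts are all hypothetical.

```python
from math import comb


def enrichment_p_value(cluster_size, known_in_cluster, network_size, known_in_network):
    """One-sided Fisher's exact test for enrichment (hypergeometric upper tail).

    Returns the probability of seeing at least `known_in_cluster` known
    disease genes in a cluster of `cluster_size` genes drawn at random from
    a network of `network_size` genes, `known_in_network` of which are known
    disease genes.
    """
    # A cluster cannot contain more known genes than exist, or than its size.
    max_hits = min(cluster_size, known_in_network)
    # Number of ways to choose any cluster of this size from the network.
    total = comb(network_size, cluster_size)
    # Sum the ways to obtain k known genes for every k at or above the observed count.
    tail = sum(
        comb(known_in_network, k)
        * comb(network_size - known_in_network, cluster_size - k)
        for k in range(known_in_cluster, max_hits + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study):
# a 10-gene cluster with 6 known IBD genes, in a 1000-gene network
# that contains 50 known IBD genes.
p = enrichment_p_value(cluster_size=10, known_in_cluster=6,
                       network_size=1000, known_in_network=50)
print(f"enrichment p-value: {p:.3e}")
```

Under this reading, a cluster with a small p-value would be treated as rich in IBD genes, and its remaining members would become candidate novel IBD genes. In practice a correction for testing many clusters would likely be needed as well; the abstract does not say which, if any, was applied.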
Finding disease-causing genes is necessary for developing drugs with synergistic effects that target many genes simultaneously. Here we present an approach to identify novel disease genes and pathways and discuss it in the context of IBD. The approach can be generalized to find disease-associated genes for other diseases.