Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Systems Biology and Bioinformatics, Jena, 07745, Germany.
Friedrich-Schiller-University, Department of Bioinformatics, Jena, 07743, Germany.
Sci Rep. 2018 Jan 11;8(1):433. doi: 10.1038/s41598-017-18370-2.
The identification of disease-associated modules based on protein-protein interaction networks (PPINs) and gene expression data has provided new insights into the mechanistic nature of diverse diseases. However, their identification is hampered by the difficulty of detecting protein communities within large-scale, whole-genome PPINs. A previously presented, successful strategy detects a PPIN's community structure based on the maximal clique enumeration (MCE) problem, which is non-deterministic polynomial time-hard (NP-hard). This renders the approach computationally challenging for large PPINs and implies the need for new strategies. We present ModuleDiscoverer, a novel approach for the identification of regulatory modules from PPINs and gene expression data. Following the MCE-based approach, ModuleDiscoverer uses a randomization heuristic-based approximation of the community structure. Given a PPIN of Rattus norvegicus and public gene expression data, we identify the regulatory module underlying a rodent model of non-alcoholic steatohepatitis (NASH), a severe form of non-alcoholic fatty liver disease (NAFLD). The module is validated using single-nucleotide polymorphism (SNP) data from independent genome-wide association studies and gene enrichment tests. Based on gene enrichment tests, we find that ModuleDiscoverer performs comparably to three existing module-detecting algorithms. However, only our NASH-module is significantly enriched with genes linked to NAFLD-associated SNPs. ModuleDiscoverer is available at http://www.hki-jena.de/index.php/0/2/490 (Others/ModuleDiscoverer).
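To illustrate the general idea of approximating a clique-based community structure by randomization instead of exhaustive enumeration, the following is a minimal sketch. It is not the ModuleDiscoverer implementation: the function names `random_maximal_clique` and `sample_cliques`, the toy network, and the number of iterations are all assumptions made for illustration. Each iteration picks a random seed protein and greedily adds random neighbours that are connected to every current member until no candidate remains, which yields one maximal clique per run.

```python
import random

def random_maximal_clique(adj, rng):
    """Grow one maximal clique from a random seed node (illustrative helper)."""
    seed_node = rng.choice(sorted(adj))
    clique = {seed_node}
    # Candidates are nodes adjacent to every current clique member.
    candidates = set(adj[seed_node])
    while candidates:
        node = rng.choice(sorted(candidates))
        clique.add(node)
        # Keep only candidates that are also adjacent to the new member.
        candidates &= adj[node]
    return frozenset(clique)

def sample_cliques(adj, iterations, seed=0):
    """Collect the distinct maximal cliques found over repeated random runs."""
    rng = random.Random(seed)
    return {random_maximal_clique(adj, rng) for _ in range(iterations)}

# Hypothetical toy interaction network (undirected, no self-loops).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques = sample_cliques(adj, iterations=200)
for clique in sorted(cliques, key=sorted):
    print(sorted(clique))
```

In a sketch like this, the number of iterations trades runtime against how completely the set of maximal cliques is covered. How the sampled cliques are combined with gene expression data to form a regulatory module is not described in the abstract and is left out here.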