Tripathi Shailesh, Moutari Salissou, Dehmer Matthias, Emmert-Streib Frank
Predictive Medicine and Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland.
Centre for Statistical Science and Operational Research, School of Mathematics and Physics, Queen's University Belfast, Belfast, UK.
BMC Bioinformatics. 2016 Mar 18;17:129. doi: 10.1186/s12859-016-0979-8.
It is generally acknowledged that a functional understanding of a biological system can only be obtained by understanding the collective of molecular interactions in the form of biological networks. Protein networks are a network type of special importance because proteins form the functional base units of every biological cell. At the mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, offering insight into the fundamental organizational principles of biological cells.
In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods on simulated benchmark networks as well as on 10 biological protein interaction networks (PINs). A particular focus of our analysis is the biological meaning of the predicted modules, which we assess by using the Gene Ontology (GO) database as a gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs, thereby simulating our incomplete knowledge of protein networks.
Overall, our study reveals a large heterogeneity among the different module prediction algorithms when one zooms in on the level of biological processes in the form of GO terms, and shows that all methods are severely affected by a slight perturbation of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.
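To make the notion of a GO term being "enriched" in a predicted module concrete, the following is a minimal sketch of a one-sided hypergeometric enrichment test. The abstract does not state which enrichment statistic the authors used, so the choice of test is an assumption, and the function name `enrichment_p_value` and the protein identifiers `P1`, `P2`, ... are hypothetical.

```python
from math import comb


def enrichment_p_value(module, annotated, background_size):
    """Upper-tail hypergeometric p-value for a GO term in a module.

    module          -- set of protein IDs in one predicted module
    annotated       -- set of protein IDs annotated with the GO term
                       (assumed to be a subset of the background)
    background_size -- total number of proteins in the network

    Returns P(X >= k), where k is the observed overlap and X is the
    overlap expected when the module is drawn at random.
    """
    n = len(module)                # module size
    K = len(annotated)             # proteins carrying the GO term
    k = len(module & annotated)    # observed overlap
    total = comb(background_size, n)
    tail = sum(
        comb(K, i) * comb(background_size - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


# Hypothetical toy data: 20 proteins, 5 of them annotated with one GO term.
annotated = {"P1", "P2", "P3", "P4", "P5"}
module = {"P1", "P2", "P3", "P9"}   # 3 of its 4 members are annotated
p = enrichment_p_value(module, annotated, background_size=20)
```

In practice such a test would be run for every module and every GO term, followed by a multiple-testing correction; a term that passes the threshold in several modules corresponds to the "enriched in multiple modules" case mentioned above.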