Department of Mathematics and Computer Science, Freie Universität Berlin, Takustrasse 9, 14195 Berlin, Germany.
Bioinformatics. 2015 Feb 1;31(3):363-72. doi: 10.1093/bioinformatics/btu652. Epub 2014 Oct 4.
Sequence and protein interaction data are important for understanding the molecular mechanisms underlying organisms. Local network alignment is one of the key systematic approaches for predicting protein functions, identifying functional modules and understanding phylogeny from these data. Most existing tools, however, are limited mainly by their scoring schemes, speed and scalability. There is therefore a growing demand for sophisticated network evolution models and efficient local alignment algorithms.
We developed LocalAli, a fast and scalable local network alignment tool for identifying functionally conserved modules in multiple networks. In this algorithm, we first proposed a new framework to reconstruct the evolutionary history of conserved modules based on a maximum-parsimony evolutionary model. With this model, the resulting local alignments can be interpreted as conserved modules that have evolved from a common ancestral module through a series of evolutionary events. Simulated annealing, a meta-heuristic method, was used to search for optimal or near-optimal inner nodes (i.e. ancestral modules) of the evolutionary tree. To evaluate performance and statistical significance, LocalAli was tested on 26 real datasets and 1040 randomly generated datasets. The results suggest that LocalAli outperforms all existing algorithms in terms of coverage, consistency and scalability, while retaining high precision in the identification of functionally coherent subnetworks.
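The search for ancestral modules by simulated annealing under a parsimony criterion can be illustrated with a small toy sketch. This is not LocalAli's code or its scoring scheme. It rests on several simplifying assumptions made only for illustration: each module is a set of interaction edges, the tree is a star with one ancestor and several leaves, and the cost of an ancestor is the number of edge gain or loss events needed to reach every leaf. The names `parsimony_cost` and `anneal_ancestor`, the cooling schedule and all parameter values are hypothetical.

```python
import math
import random


def parsimony_cost(ancestor, leaves):
    """Count edge gain/loss events needed to turn `ancestor` into each leaf."""
    return sum(len(ancestor ^ leaf) for leaf in leaves)


def anneal_ancestor(leaves, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Search for a low-cost ancestral edge set with simulated annealing."""
    rng = random.Random(seed)
    universe = sorted(set().union(*leaves))  # every edge seen in any leaf
    current = set(leaves[0])                 # start from one observed module
    cost = parsimony_cost(current, leaves)
    best, best_cost = set(current), cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Neighbour move: add or remove one randomly chosen edge.
        candidate = current ^ {rng.choice(universe)}
        cand_cost = parsimony_cost(candidate, leaves)
        delta = cand_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cost = candidate, cand_cost
            if cost < best_cost:
                best, best_cost = set(current), cost
    return best, best_cost


# Three hypothetical leaf modules, given as sets of protein-protein edges.
leaves = [
    frozenset({("a", "b"), ("b", "c")}),
    frozenset({("a", "b"), ("b", "c"), ("c", "d")}),
    frozenset({("a", "b"), ("c", "d")}),
]
best, best_cost = anneal_ancestor(leaves)
print(sorted(best), best_cost)
```

In this toy setting the cost decomposes edge by edge, so the best ancestor simply keeps every edge present in a majority of leaves. Annealing is not needed here. It becomes relevant when the tree has several inner nodes and the score couples nodes and edges, as in the setting the abstract describes.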
The source code and test datasets are freely available for download under the GNU GPL v3 license at https://code.google.com/p/localali/.
jialu.hu@fu-berlin.de or knut.reinert@fu-berlin.de.
Supplementary data are available at Bioinformatics online.