Tian Wenhong, Samatova Nagiza F
Department of Computer and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.
Pac Symp Biocomput. 2009:99-110.
A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. They typically find conserved interaction patterns using various local or global search algorithms and then validate the results against genome annotation. Improving the speed, scalability, and accuracy of network alignment remains a target of ongoing research. In view of this, we introduce a connected-components-based algorithm, called HopeMap, for pairwise network alignment, with a focus on the fast identification of maximal conserved patterns across species. Observing that the number of true homologs across species is relatively small compared to the total number of proteins in all species, we start with highly homologous groups across species, find maximal conserved interaction patterns globally with a generic scoring system, and validate the results across multiple known functional annotations. The results are evaluated in terms of the statistical enrichment of Gene Ontology (GO) terms and KEGG ortholog (KO) groups within conserved interaction patterns. HopeMap is fast, with linear computational cost; accurate in terms of the specificity and sensitivity of KO groups and GO terms; and extensible to multiple-network alignment.
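The connected-components idea described above can be illustrated with a minimal sketch. This is not the published HopeMap implementation: the function name `conserved_components`, the input layout (edge lists plus a list of homolog pairs), and the rule that two homolog pairs are linked when their proteins interact in both networks are all assumptions made for illustration, and the scoring system mentioned in the abstract is omitted.

```python
from collections import defaultdict, deque


def conserved_components(edges_a, edges_b, homologs):
    """Group homolog pairs into connected components of conserved interactions.

    Hypothetical illustration, not the authors' code.
    edges_a, edges_b: iterables of (protein, protein) interactions per species.
    homologs: list of (protein_in_a, protein_in_b) pairs assumed homologous.
    """
    # Adjacency for network A, undirected edge set for network B.
    adj_a = defaultdict(set)
    for u, v in edges_a:
        adj_a[u].add(v)
        adj_a[v].add(u)
    edge_set_b = {frozenset(e) for e in edges_b}

    # Index homolog pairs by their species-A protein for fast lookup.
    pairs_by_a = defaultdict(list)
    for a, b in homologs:
        pairs_by_a[a].append((a, b))

    # Alignment graph: nodes are homolog pairs; two pairs are linked when
    # the interaction is present in both networks (assumed conservation rule).
    graph = {pair: set() for pair in homologs}
    for a1, b1 in homologs:
        for a2 in adj_a[a1]:
            for _, b2 in pairs_by_a.get(a2, ()):
                if b1 != b2 and frozenset((b1, b2)) in edge_set_b:
                    graph[(a1, b1)].add((a2, b2))
                    graph[(a2, b2)].add((a1, b1))

    # Breadth-first search collects each connected component once.
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if len(component) > 1:  # singletons carry no conserved interaction
            components.append(sorted(component))
    return components


if __name__ == "__main__":
    # Toy data: the a4-a5 interaction has no counterpart in network B.
    net_a = [("a1", "a2"), ("a2", "a3"), ("a4", "a5")]
    net_b = [("b1", "b2"), ("b2", "b3")]
    pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4"), ("a5", "b5")]
    print(conserved_components(net_a, net_b, pairs))
```

Restricting the alignment graph to homolog pairs keeps it small relative to the full networks, and the component search itself visits each node and edge of that graph once, which is consistent with the low cost the abstract reports.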