Department of Computer Science, Purdue University, West Lafayette, IN, USA.
BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S35. doi: 10.1186/1471-2105-11-S1-S35.
Analyzing interaction networks for functional characterization poses significant challenges arising from the noisy, incomplete, and generic nature of both the interaction data and the functional annotation of molecules. Network-based methods infer functional associations by focusing on interacting molecules (pairs or sets) that occur in close proximity.
In this paper we perform a formal comparative investigation of the relationship between functional coherence and topological proximity in networks. We investigate the problem of assessing the coherence of sets of biomolecules (or segments thereof) taking into account functional specificity as well as the distribution of functional attributes across entity groups. We also propose novel measures of topological proximity that are more robust to noisy and incomplete interaction data.
We derive the following results in this paper: (i) there is a strong correlation between functional similarity and topological proximity in various network abstractions, with domain interaction networks (DDIs) demonstrating higher correlation than protein interaction networks (PPIs); (ii) measures that quantify coherence among entire sets of proteins are superior to aggregates of known pair-wise measures; and (iii) random-walk based measures of topological proximity are better suited to existing interaction data. We validate our methods on diverse data, including experimentally and computationally derived PPIs and DDIs, as well as on sets of known biologically related groups of molecules.
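To make the idea of a random-walk based proximity measure concrete, the following is a minimal sketch of a generic random walk with restart on an undirected interaction network. The abstract does not specify the measures proposed in the paper, so this should not be read as the authors' method: the function name `random_walk_with_restart`, the `restart` probability of 0.3, and the toy edge list are all assumptions chosen for illustration.

```python
from collections import defaultdict


def random_walk_with_restart(edges, source, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at `source`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Build an undirected adjacency list from the edge pairs.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    if source not in neighbors:
        raise KeyError(f"source node {source!r} is not in the network")
    nodes = sorted(neighbors)

    # Start with all probability mass on the source node.
    p = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            # With probability (1 - restart), step to a uniformly chosen neighbor.
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        # With probability `restart`, jump back to the source.
        nxt[source] += restart
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network (assumed data): a triangle A-B-C with a tail C-D-E.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
scores = random_walk_with_restart(toy_edges, source="A")
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Unlike shortest-path distance, the resulting score depends on all paths between two nodes, so a single missing or spurious edge changes it only gradually. This is one plausible reason such measures tolerate noisy and incomplete interaction data.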