Segura Joan, Sorzano C O S, Cuenca-Alba Jesus, Aloy Patrick, Carazo J M
GN7 of the National Institute for Bioinformatics (INB) and Biocomputing Unit, National Center of Biotechnology (CSIC), c/ Darwin no 3, Campus of Cantoblanco, 28049, Madrid, Spain.
Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), c/ Baldiri Reixac 10-12, 08028, Barcelona, Spain and Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010, Barcelona, Spain.
Bioinformatics. 2015 Aug 1;31(15):2545-52. doi: 10.1093/bioinformatics/btv188. Epub 2015 Apr 2.
In recent years, large-scale studies have been undertaken to describe, at least partially, protein-protein interaction maps, or interactomes, for a number of relevant organisms, including humans. However, current interactomes provide a somewhat limited picture of the molecular details of protein interactions, mostly because essential experimental information, especially structural data, is lacking. Indeed, the gap between structural and interactomics information is widening and thus, for most interactions, key experimental information is missing. We build on the observation that many interactions between proteins involve a pair of their constituent domains and, thus, knowledge of how protein domains interact adds very significant information to any interactomic analysis.
In this work, we describe a novel use of the neighborhood cohesiveness property to infer interactions between protein domains given a protein interaction network. We show that some clustering coefficients can be extended to measure the degree of cohesiveness between two sets of nodes within a network. Specifically, we used the meet/min coefficient to measure the proportion of interacting nodes between two sets of nodes and the fraction of common neighbors. This approach extends previous works where homolog coefficients were first defined around network nodes and later around edges. Compared with current methods, the proposed approach substantially increases both the number of predicted domain-domain interactions and their accuracy.
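As a rough illustration of the idea, the meet/min coefficient of two sets A and B is commonly written as |A ∩ B| / min(|A|, |B|). The sketch below applies it to two sets of proteins in a small toy network. The abstract does not give the exact formulas used in the paper, so the two scores here (`interacting_fraction` and `common_neighbor_fraction`), the helper names, and the toy edge list are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def meet_min(a, b):
    """Meet/min coefficient: |A ∩ B| / min(|A|, |B|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def neighbors_of(adj, nodes):
    """Union of the neighbors of every node in `nodes`."""
    result = set()
    for n in nodes:
        result |= adj.get(n, set())
    return result


def interacting_fraction(adj, s, t):
    """Assumed score: meet/min between set S and the neighborhood of set T."""
    return meet_min(set(s), neighbors_of(adj, t))


def common_neighbor_fraction(adj, s, t):
    """Assumed score: meet/min between the neighborhoods of S and T."""
    return meet_min(neighbors_of(adj, s), neighbors_of(adj, t))


# Hypothetical toy network: S holds proteins carrying one domain,
# T holds proteins carrying another domain.
edges = [("p1", "q1"), ("p2", "q1"), ("p3", "x"), ("q2", "x")]
adj = build_adjacency(edges)
S = {"p1", "p2", "p3"}
T = {"q1", "q2"}

print(interacting_fraction(adj, S, T))      # 2 of 3 proteins in S touch T
print(common_neighbor_fraction(adj, S, T))  # only "x" is shared by both neighborhoods
```

In this toy case, high values of both scores for a pair of domain-defined protein sets would be read as evidence that the two domains interact; how the scores are combined and thresholded is left open here because the abstract does not specify it.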