Department of Computing, The Hong Kong Polytechnic University, Hung Hom 999077, Hong Kong.
IEEE Trans Biomed Eng. 2012 Apr;59(4):899-908. doi: 10.1109/TBME.2010.2093524. Epub 2010 Nov 18.
Protein molecules interact with each other in protein complexes to perform many vital functions, and a range of computational techniques have been developed to identify protein complexes in protein-protein interaction (PPI) networks. These techniques search for highly connected subgraphs in PPI networks, under the assumption that the proteins in a protein complex are highly interconnected. While these techniques have been shown to be quite effective, the matching rate between the protein complexes they discover and those previously determined experimentally can be relatively low, and the "false-alarm" rate can be relatively high. This is especially the case when the assumption that proteins in a complex are more highly interconnected does not hold well. To increase the matching rate and reduce the false-alarm rate, we have developed a technique that can work effectively without making this assumption. The technique, called protein complex identification by discovering functional interdependence (PCIFI), searches for protein complexes in PPI networks by taking into consideration both the functional interdependence relationships between protein molecules and the topology of the network. PCIFI works in four steps. The first step is to construct a multiple-function protein network graph by labeling each vertex with one or more of the molecular functions that the corresponding protein performs. The second step is to filter out interactions between protein pairs that are not functionally interdependent in the statistical sense. The third step is to use an information-theoretic measure to determine the strength of the functional interdependence between all remaining interacting protein pairs. The last step is to form protein complexes based on this measure of functional interdependence strength and the connectivity between proteins.
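The four steps can be illustrated with a minimal Python sketch. It is not the published PCIFI implementation: the abstract does not name the statistical test, the information-theoretic measure, or the grouping rule, so the sketch assumes an adjusted-residual test on function-label co-occurrence, a log2(observed/expected) weight, and plain connected components as stand-ins. All function names, the `Z_CRITICAL` cutoff, and the `min_weight` and `min_size` parameters are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product

Z_CRITICAL = 1.96  # assumed cutoff (two-sided 5% level of a standard normal)


def label_pair_table(edges, functions):
    """Step 1 (sketch): count function-label pairs over interacting proteins.

    Each edge is counted in both orientations so the table is symmetric;
    this inflates the sample size, so the test below is only illustrative.
    """
    observed, marginal = Counter(), Counter()
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            for f, g in product(functions[a], functions[b]):
                observed[(f, g)] += 1
                marginal[f] += 1
    return observed, marginal, sum(observed.values())


def significant_pairs(observed, marginal, total):
    """Steps 2-3 (assumed test and measure): keep over-represented label
    pairs and score each one by log2(observed / expected)."""
    scores = {}
    for (f, g), o in observed.items():
        e = marginal[f] * marginal[g] / total
        var = e * (1 - marginal[f] / total) * (1 - marginal[g] / total)
        if var > 0 and (o - e) / math.sqrt(var) > Z_CRITICAL:
            scores[(f, g)] = math.log2(o / e)
    return scores


def edge_weights(edges, functions, scores):
    """Drop edges with no significant label pair; weight the remaining ones."""
    weights = {}
    for u, v in edges:
        w = sum(scores.get(p, 0.0) for p in product(functions[u], functions[v]))
        if w > 0:
            weights[(u, v)] = w
    return weights


def group_complexes(weights, min_weight=0.0, min_size=3):
    """Step 4 (stand-in): connected components over sufficiently strong edges."""
    adj = defaultdict(set)
    for (u, v), w in weights.items():
        if w >= min_weight:
            adj[u].add(v)
            adj[v].add(u)
    seen, found = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        if len(comp) >= min_size:
            found.append(comp)
    return found


def find_complexes(edges, functions):
    """Run the four sketched steps end to end."""
    observed, marginal, total = label_pair_table(edges, functions)
    scores = significant_pairs(observed, marginal, total)
    return group_complexes(edge_weights(edges, functions, scores))
```

Here `edges` is a list of protein-name pairs and `functions` maps each protein name to a set of function labels. Because the grouping step only follows edges that survive the functional filter, a sparsely connected set of proteins can still be reported as a complex, which is the behavior the abstract motivates.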
For performance evaluation, PCIFI was used to identify protein complexes in real PPI network data, and the protein complexes it found were matched against the known protein complexes recorded in MIPS. The results show that PCIFI can be an effective technique for the identification of protein complexes. The protein complexes it found match more known protein complexes, with a lower false-alarm rate, and provide useful insights into the functional interdependence relationships between proteins in protein complexes.
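The evaluation described above can be sketched as follows. The abstract does not state how a predicted complex is judged to match a known one, nor how the two rates are computed, so this sketch assumes a commonly used overlap score, |A ∩ B|² / (|A| · |B|), with a hypothetical `threshold` parameter. It also assumes that the matching rate is the fraction of known complexes matched by at least one prediction and that the false-alarm rate is the fraction of predictions matching no known complex.

```python
def overlap_score(a, b):
    """Overlap between two non-empty protein sets: |a & b|^2 / (|a| * |b|)."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def evaluate(predicted, known, threshold=0.5):
    """Return the assumed matching rate and false-alarm rate.

    predicted, known: lists of non-empty sets of protein identifiers.
    threshold: hypothetical minimum overlap score that counts as a match.
    """
    if not predicted or not known:
        raise ValueError("both complex lists must be non-empty")
    matched_known = sum(
        any(overlap_score(k, p) >= threshold for p in predicted) for k in known
    )
    unmatched_predicted = sum(
        not any(overlap_score(p, k) >= threshold for k in known) for p in predicted
    )
    return {
        "matching_rate": matched_known / len(known),
        "false_alarm_rate": unmatched_predicted / len(predicted),
    }
```

A stricter `threshold` lowers the matching rate and raises the false-alarm rate for the same predictions, so any comparison between techniques should use one fixed value.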