Chua Hon Nian, Sung Wing-Kin, Wong Limsoon
Graduate School for Integrated Sciences and Engineering, National University of Singapore, Singapore.
Bioinformatics. 2006 Jul 1;22(13):1623-30. doi: 10.1093/bioinformatics/btl145. Epub 2006 Apr 21.
Most approaches to predicting protein function from protein-protein interaction data rely on the observation that a protein often shares functions with the proteins that interact with it (its level-1 neighbours). However, proteins that interact with the same proteins (i.e. level-2 neighbours) may also have a greater likelihood of sharing similar physical or biochemical characteristics. We speculate that functional similarity between a protein and its neighbours at the two levels arises from two distinct forms of functional association, and that a protein is likely to share functions with its level-1 and/or level-2 neighbours. We are interested in finding out how significant the functional association between level-2 neighbours is and how it can be exploited for protein function prediction.
We carried out a statistical study on recent interaction data and found that functional association between level-2 neighbours is clearly observable. A substantial number of proteins share functions with level-2 neighbours but not with level-1 neighbours. We develop an algorithm that predicts the functions of a protein in two steps: (1) assigning a weight to each of its level-1 and level-2 neighbours by estimating that neighbour's functional similarity with the protein, using the local topology of the interaction network as well as the reliability of experimental sources; and (2) scoring each function based on its weighted frequency among these neighbours. Using leave-one-out cross-validation, we compare the performance of our method against that of several existing approaches and show that our method performs relatively well.
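The two-step procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weight used here (a reliability-scaled overlap of the two proteins' neighbourhoods) is a simple stand-in chosen for illustration, and the names `edge_reliability`, `similarity_weight` and `predict_functions`, as well as the dictionary-based graph layout, are assumptions. The network is assumed to be undirected, with each edge stored in both directions together with a reliability value between 0 and 1.

```python
from collections import defaultdict

def edge_reliability(graph, a, b):
    """Reliability of edge a-b; a protein is treated as fully linked to itself."""
    return 1.0 if a == b else graph.get(a, {}).get(b, 0.0)

def similarity_weight(graph, u, v):
    """Stand-in weight (an assumption, not the paper's formula): overlap of the
    neighbourhoods of u and v, each shared member scaled by edge reliabilities."""
    nu = set(graph.get(u, {})) | {u}
    nv = set(graph.get(v, {})) | {v}
    shared = sum(edge_reliability(graph, u, w) * edge_reliability(graph, v, w)
                 for w in nu & nv)
    return shared / len(nu | nv)

def predict_functions(graph, annotations, protein):
    """Score functions by weighted frequency among level-1 and level-2 neighbours."""
    level1 = set(graph.get(protein, {}))
    level2 = {w for n in level1 for w in graph.get(n, {})}
    # The protein's own annotations are never used, as in leave-one-out evaluation.
    candidates = (level1 | level2) - {protein}

    scores = defaultdict(float)
    for neighbour in candidates:
        w = similarity_weight(graph, protein, neighbour)   # step 1: weight
        for function in annotations.get(neighbour, ()):    # step 2: score
            scores[function] += w
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

if __name__ == "__main__":
    # Toy network: A interacts with B and C; D is a level-2 neighbour of A.
    toy_graph = {
        "A": {"B": 1.0, "C": 1.0},
        "B": {"A": 1.0, "D": 1.0},
        "C": {"A": 1.0, "D": 1.0},
        "D": {"B": 1.0, "C": 1.0},
    }
    toy_annotations = {"B": {"f1"}, "C": {"f1"}, "D": {"f2"}}
    print(predict_functions(toy_graph, toy_annotations, "A"))
```

In the toy network, protein D does not interact with A directly but shares both of A's interaction partners, so it still contributes to the score of its function. This is the level-2 effect the abstract describes.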