Peng Wei, Wang Jianxin, Wang Weiping, Liu Qing, Wu Fang-Xiang, Pan Yi
School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, People's Republic of China.
BMC Syst Biol. 2012 Jul 18;6:87. doi: 10.1186/1752-0509-6-87.
Identifying essential proteins is important for understanding the minimal requirements for cellular survival and development. Many computational methods have been proposed for predicting essential proteins from the topological features of protein-protein interaction (PPI) networks. However, most of these methods ignore the intrinsic biological meaning of proteins. Moreover, PPI data contain many false positives and false negatives. To overcome these limitations, many research groups have recently begun to identify essential proteins by integrating PPI networks with other biological information. However, none of these methods has been widely accepted.
Based on two observations, that essential proteins are more evolutionarily conserved than nonessential proteins and that essential proteins frequently bind each other, we propose an iterative method, named ION, for predicting essential proteins by integrating orthology with PPI networks. Unlike other methods, ION identifies essential proteins using not only the connections between proteins but also their orthologous properties and the features of their neighbors. ION is applied to predict essential proteins in S. cerevisiae. Experimental results show that ION achieves higher identification accuracy than eight existing centrality methods in terms of area under the curve (AUC). Moreover, ION identifies a large number of essential proteins that the eight centrality methods miss because of their low connectivity. Many proteins ranked in the top 100 by ION are both essential and belong to complexes with specific biological functions. Furthermore, ION outperforms all eight centrality methods regardless of how many reference organisms are selected, and using as many reference organisms as possible improves its performance further. ION also shows good prediction performance in E. coli K-12.
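The abstract does not state ION's update rule, so the following is only a minimal sketch of the general idea: each protein's score mixes its own orthology evidence with scores passed along PPI edges from its neighbors, and the mixing is repeated until the scores stabilize. The PageRank-style damped update, the uniform splitting of a protein's score among its neighbors, the parameter `alpha`, and the function name `ion_like_scores` are all assumptions made for illustration, not the authors' published formulation.

```python
def ion_like_scores(edges, orthology, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iteratively score proteins from orthology counts and PPI neighbors.

    edges:     iterable of (protein, protein) pairs (undirected PPI network)
    orthology: dict mapping protein -> number of reference organisms in
               which it has an ortholog (hypothetical input format)
    alpha:     weight given to neighbor scores (assumed parameter, 0 <= alpha < 1)
    """
    # Build an undirected adjacency map.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    for p in orthology:
        neighbors.setdefault(p, set())
    nodes = sorted(neighbors)

    # Normalize orthology counts to [0, 1] so they act as a prior score.
    max_orth = max((orthology.get(p, 0) for p in nodes), default=0)
    prior = {p: (orthology.get(p, 0) / max_orth if max_orth else 0.0)
             for p in nodes}

    scores = dict(prior)
    for _ in range(max_iter):
        new_scores = {}
        for p in nodes:
            # Each neighbor q splits its current score evenly among its edges
            # (assumption: uniform edge weights).
            spread = sum(scores[q] / len(neighbors[q]) for q in neighbors[p])
            new_scores[p] = (1 - alpha) * prior[p] + alpha * spread
        delta = max(abs(new_scores[p] - scores[p]) for p in nodes)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy example with made-up data: a four-protein chain A-B-C-D in which only
# A and B have orthologs in the (hypothetical) reference organisms.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
toy_orthology = {"A": 3, "B": 3, "C": 0, "D": 0}
toy_scores = ion_like_scores(toy_edges, toy_orthology)
for protein in sorted(toy_scores, key=toy_scores.get, reverse=True):
    print(protein, round(toy_scores[protein], 4))
```

In this sketch a protein with few interactions can still rank highly when it is well conserved or sits next to high-scoring proteins, which is consistent with the abstract's remark that ION recovers essential proteins that purely topological centrality measures overlook.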
The accuracy of essential protein prediction can be improved by integrating orthology with PPI networks.