Luo Jiawei, Wu Juan
Int J Data Min Bioinform. 2015;12(3):257-74. doi: 10.1504/ijdmb.2015.069654.
Essential proteins provide valuable system-level information for biological and medical research. The accuracy of methods based solely on topological centrality is strongly affected by noise in the network, so more effective methods for identifying essential proteins would be of great value. Using biological features to identify essential proteins is an effective way to reduce the impact of noise in the protein-protein interaction (PPI) network. In this paper, based on the consideration that essential proteins evolve slowly and play a central role within a network, a new algorithm named CED is proposed. CED mainly uses gene expression levels, protein complex information and the edge clustering coefficient to predict essential proteins. The performance of CED is validated on yeast PPI networks obtained from the DIP and BioGRID databases. On both datasets, the prediction accuracy of CED is higher than that of seven other algorithms.
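To make one of the three ingredients concrete, the following is a minimal sketch of the edge clustering coefficient (ECC) in the form commonly used in the PPI literature: the number of triangles containing an edge, divided by the maximum number that edge could belong to. It is not the CED scoring function; the abstract does not state how CED combines ECC with gene expression and protein complex information, and the assumption that CED uses exactly this ECC definition is ours. The function names and the toy network (`P1` to `P4`) are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map, ignoring self-interactions."""
    graph: Graph = {}
    for u, v in edges:
        if u == v:
            continue
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def edge_clustering_coefficient(graph: Graph, u: str, v: str) -> float:
    """ECC(u, v) = z / min(deg(u) - 1, deg(v) - 1), where z is the number
    of triangles that contain the edge (u, v)."""
    triangles = len(graph[u] & graph[v])  # common neighbours close a triangle
    max_triangles = min(len(graph[u]) - 1, len(graph[v]) - 1)
    if max_triangles == 0:
        # An endpoint of degree 1 cannot take part in any triangle.
        return 0.0
    return triangles / max_triangles


def summed_ecc(graph: Graph) -> Dict[str, float]:
    """Per-protein sum of ECC over incident edges (an illustrative
    topological score only, not the CED score)."""
    return {
        u: sum(edge_clustering_coefficient(graph, u, v) for v in neighbours)
        for u, neighbours in graph.items()
    }


# Toy network: a triangle P1-P2-P3 with a pendant protein P4 attached to P3.
toy = build_graph([("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")])
scores = summed_ecc(toy)
```

In this toy network the three edges of the triangle each have ECC 1.0, while the edge to the pendant protein has ECC 0.0, which illustrates why ECC can down-weight sparsely supported (and possibly noisy) interactions.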