IEEE/ACM Trans Comput Biol Bioinform. 2019 Mar-Apr;16(2):377-387. doi: 10.1109/TCBB.2017.2701824. Epub 2017 May 12.
Essential proteins are critical to the development and survival of cells. Identifying essential proteins helps in understanding the minimal set of genes required in a living cell and in designing new drugs. To detect essential proteins, various computational methods based on protein-protein interaction (PPI) networks have been proposed. However, protein interaction data obtained by high-throughput experiments usually contain many false positives, which reduce the accuracy of essential protein detection. Moreover, most existing studies have focused on the local information of proteins in PPI networks while ignoring the influence of indirect protein interactions on essentiality. In this paper, we propose a novel method, called Essentiality Ranking (EssRank for short), to boost the accuracy of essential protein detection. To deal with the inaccuracy of PPI data, confidence scores of interactions are evaluated by integrating various kinds of biological information. A weighted edge clustering coefficient (WECC), which considers both interaction confidence scores and network topology, is proposed to calculate edge weights in PPI networks. The weight of each node is computed as the sum of the WECC values of its incident edges. A random walk method, which makes use of both direct and indirect protein interactions, is then employed to calculate protein essentiality iteratively. Experimental results on the yeast PPI network show that EssRank outperforms most existing methods, including the most commonly used centrality measures (SC, DC, BC, CC, IC, and EC), topology-based methods (DMNC and NC), and the data-integration method IEW.
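The pipeline described above (confidence-weighted edges, node weights summed from edge weights, then an iterative random walk) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything specific here is an assumption rather than the authors' definition: the WECC form (confidence-weighted triangle support divided by min(deg(u), deg(v)) - 1, modeled on the standard edge clustering coefficient), the restart-style walk, the damping value `alpha=0.85`, and the hypothetical function names `build_graph`, `wecc`, and `rank_proteins`.

```python
def build_graph(interactions):
    """Build a symmetric adjacency map {u: {v: confidence}} from (u, v, confidence) triples."""
    conf = {}
    for u, v, c in interactions:
        conf.setdefault(u, {})[v] = c
        conf.setdefault(v, {})[u] = c
    return conf


def wecc(conf, u, v):
    """Assumed weighted edge clustering coefficient of edge (u, v).

    Each common neighbor w contributes conf(u, w) * conf(v, w); the total is
    normalized by min(deg(u), deg(v)) - 1, as in the unweighted coefficient.
    """
    denom = min(len(conf[u]), len(conf[v])) - 1
    if denom <= 0:
        return 0.0
    common = set(conf[u]) & set(conf[v])
    return sum(conf[u][w] * conf[v][w] for w in common) / denom


def rank_proteins(interactions, alpha=0.85, max_iter=200, tol=1e-12):
    """Score proteins with a random walk with restart (assumed scheme, not the paper's)."""
    conf = build_graph(interactions)
    nodes = sorted(conf)
    # Edge weights from the assumed WECC; node weight = sum over incident edges.
    w = {u: {v: wecc(conf, u, v) for v in conf[u]} for u in nodes}
    node_weight = {u: sum(w[u].values()) for u in nodes}
    total = sum(node_weight.values())
    # Restart distribution proportional to node weight (uniform if all weights are zero).
    if total > 0:
        prior = {u: node_weight[u] / total for u in nodes}
    else:
        prior = {u: 1.0 / len(nodes) for u in nodes}
    score = dict(prior)
    for _ in range(max_iter):
        new = {u: (1 - alpha) * prior[u] for u in nodes}
        for u in nodes:
            if node_weight[u] == 0:
                # No weighted out-edges: hand the walker's mass back to the prior.
                for v in nodes:
                    new[v] += alpha * score[u] * prior[v]
            else:
                for v, w_uv in w[u].items():
                    new[v] += alpha * score[u] * w_uv / node_weight[u]
        delta = sum(abs(new[u] - score[u]) for u in nodes)
        score = new
        if delta < tol:
            break
    return score


if __name__ == "__main__":
    # Toy network (made-up confidences): a triangle A-B-C plus a pendant protein D.
    toy = [("A", "B", 0.9), ("A", "C", 0.9), ("B", "C", 0.9), ("A", "D", 0.5)]
    for protein, s in sorted(rank_proteins(toy).items(), key=lambda kv: -kv[1]):
        print(protein, round(s, 4))
```

In this toy network the pendant protein D shares no neighbors with A, so its only edge gets zero weight under the assumed WECC and D ranks last. This shows how a triangle-based edge weight can discount interactions that lack topological support.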