IEEE/ACM Trans Comput Biol Bioinform. 2020 Nov-Dec;17(6):2053-2061. doi: 10.1109/TCBB.2019.2916038. Epub 2020 Dec 8.
Essential proteins are indispensable for maintaining normal cellular functions. Identifying essential proteins from protein-protein interaction (PPI) networks has become a hot topic in recent years. Traditional experiment-based approaches are time-consuming and expensive, and although many computational methods have been developed in recent years, their prediction accuracy is still unsatisfactory. In this research, we introduce protein subcellular localization information and define a new measure that characterizes a protein's subcellular localization essentiality. We then develop a new data-fusion method for identifying essential proteins, named TEGS, which integrates network topology, gene expression profiles, GO annotation information, and protein subcellular localization information. To demonstrate the effectiveness of TEGS, we evaluate its performance on two Saccharomyces cerevisiae datasets and compare it with seven other state-of-the-art methods (DC, BC, NC, PeC, WDC, SON, and TEO) in terms of the number of correctly predicted essential proteins, the jackknife curve, and the precision-recall curve. The results show that TEGS outperforms the compared methods in identifying essential proteins. The source code of TEGS is freely available at https://github.com/wzhangwhu/TEGS.
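The evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the TEGS scoring function, which is available in the authors' repository. It ranks proteins by degree centrality (DC, one of the baseline methods listed) and then computes a jackknife curve, taken here to mean the cumulative number of known essential proteins among the top-k ranked candidates, together with precision-recall points. The edge list, the essential-protein set, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {protein: number of distinct interaction partners}."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def jackknife_curve(ranked, essential):
    """Cumulative count of known essential proteins in the top-k, k = 1..n."""
    curve, hits = [], 0
    for protein in ranked:
        hits += protein in essential
        curve.append(hits)
    return curve


def precision_recall_points(ranked, essential):
    """(precision, recall) after each of the top-k cut-offs, k = 1..n."""
    points, hits = [], 0
    for k, protein in enumerate(ranked, start=1):
        hits += protein in essential
        points.append((hits / k, hits / len(essential)))
    return points


# Toy PPI network and toy gold standard (hypothetical data, not from the paper).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A", "C"}

scores = degree_centrality(edges)
# Sort by descending score; break ties alphabetically for reproducibility.
ranked = sorted(scores, key=lambda p: (-scores[p], p))

curve = jackknife_curve(ranked, essential)
pr_points = precision_recall_points(ranked, essential)

print(ranked)
print(curve)
print(pr_points)
```

Any other scoring method can be evaluated the same way by replacing `scores` with its own per-protein values; the ranking and the two curves do not depend on how the scores were produced.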