Xu Yan, Ding Ya-Xin, Ding Jun, Wu Ling-Yun, Deng Nai-Yang
Department of Information and Computer Science, University of Science and Technology, Beijing, China.
J Theor Biol. 2015 Aug 21;379:10-5. doi: 10.1016/j.jtbi.2015.04.016. Epub 2015 Apr 24.
Large-scale characterization of post-translational modifications (PTMs), such as phosphorylation, acetylation and ubiquitination, has highlighted their importance in the regulation of a myriad of signaling events. However, for another type of PTM, lysine phosphoglycerylation, data on modified sites have only been obtained in recent years, through manual experiments. Given an uncharacterized protein sequence that contains many lysine residues, which of them can be phosphoglycerylated and which cannot? This is a challenging problem, so a useful computational method and an efficient predictor are highly desirable. Here we developed a new predictor, Phogly-PseAAC, which incorporates position-specific amino acid propensity. Features were also ranked by importance according to their F-score values. With the best feature set, the predictor achieved an accuracy of 75.10%, a sensitivity of 68.87%, a specificity of 75.57% and a Matthews correlation coefficient (MCC) of 0.2538 in leave-one-out (LOO) cross-validation with the center nearest neighbor algorithm. A web-server for Phogly-PseAAC is accessible at http://app.aporc.org/Phogly-PseAAC/. For the convenience of experimental scientists, we also provide brief instructions for the web-server, with which users can obtain their desired results without having to follow the complicated mathematics presented in this paper. We anticipate that Phogly-PseAAC may become a useful high-throughput tool for identifying lysine phosphoglycerylation sites.
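The four performance measures quoted in the abstract are standard functions of the confusion counts of a binary classifier. The following is a minimal sketch of how they are conventionally computed, not the authors' code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification measures from confusion counts.

    tp/fn: positive (modified) sites predicted correctly/incorrectly.
    tn/fp: negative (unmodified) sites predicted correctly/incorrectly.
    Assumes at least one positive and one negative sample.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of true sites recovered
    specificity = tn / (tn + fp)   # fraction of non-sites rejected
    # Matthews correlation coefficient; defined as 0 when the
    # denominator vanishes (a row or column of the table is empty).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical, imbalanced example counts (NOT the paper's data).
example = binary_metrics(tp=8, fn=2, tn=80, fp=20)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

In this made-up example the negatives outnumber the positives ten to one, and the MCC comes out well below the accuracy even though sensitivity and specificity are equal. This illustrates why MCC is commonly reported alongside accuracy when the classes are imbalanced.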