Wang MingHui, Li ChunHua, Chen WeiZu, Wang CunXin
College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100022, China.
Sci China C Life Sci. 2008 Jan;51(1):12-20. doi: 10.1007/s11427-008-0012-1.
Phosphorylation is a key mechanism for controlling protein activity in vivo in many eukaryotic organisms. Experimental methods for determining phosphorylation sites in substrates are usually constrained by the in vitro conditions the enzymes require, and they are time-consuming and labor-intensive. Although several in silico methods and web servers have been introduced for automatic detection of phosphorylation sites, more sophisticated methods are still urgently needed to improve prediction performance. Protein primary sequences can help predict the phosphorylation sites catalyzed by different protein kinases (PKs), and most computational approaches use a short local peptide to make predictions. However, useful information may be lost when conserved residues that are not close to the phosphorylation site are left out of the prediction, which hampers the results. This paper presents a novel prediction method named IEPP (Information-Entropy based Phosphorylation Prediction) for automatic detection of potential phosphorylation sites. In prediction, the positions around a phosphorylation site are selected or excluded according to their entropy values. The algorithm was compared with other methods, such as GSP and PPSP, on the ABL, MAPK and PKA PK families. IEPP achieved higher prediction accuracy on measures such as sensitivity (Sn) and specificity (Sp). Furthermore, when compared with several online prediction web servers on newly discovered phosphorylation sites, IEPP also yielded the best performance. IEPP is another useful computational resource for identifying PK-specific phosphorylation sites, with the advantages of simplicity, efficiency and convenience.
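The entropy-based selection step can be illustrated with a minimal sketch. The abstract does not give the exact entropy formula, the threshold, or the direction of the cutoff, so the code below makes its own assumptions: it uses Shannon entropy per aligned position, keeps positions whose entropy is at or below a cutoff (that is, the more conserved ones), and takes the central residue of each peptide as the phosphorylation site. The function names `column_entropy` and `select_positions`, the parameter `max_entropy`, and the toy peptides are all invented for illustration and are not taken from IEPP.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed at one aligned position."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def select_positions(peptides, max_entropy):
    """Return (offset, entropy) pairs for flanking positions worth keeping.

    Assumptions (not from the paper): all peptides have the same odd length,
    the phosphorylation site is the central residue, and a position is kept
    when its entropy is <= max_entropy.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    center = length // 2
    selected = []
    for i in range(length):
        if i == center:
            continue  # skip the phosphorylation site itself
        entropy = column_entropy([p[i] for p in peptides])
        if entropy <= max_entropy:
            selected.append((i - center, entropy))
    return selected


# Toy, made-up peptides centred on a serine; not real experimental data.
toy_peptides = ["RRASVAG", "RRGSLKE", "RRLSIDP", "RRTSFQN"]
kept = select_positions(toy_peptides, max_entropy=1.0)
kept_offsets = [offset for offset, _ in kept]
print(kept_offsets)
```

In this toy set only the two fully conserved positions upstream of the site fall under the cutoff, so a predictor built on the selected positions would ignore the variable flanking residues. Because the selection is driven by entropy rather than by distance, a longer window would let conserved positions far from the site be kept as well.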