Deng Wankun, Wang Yongbo, Ma Lili, Zhang Ying, Ullah Shahid, Xue Yu
Brief Bioinform. 2017 Jul 1;18(4):647-658. doi: 10.1093/bib/bbw041.
Protein methylation is an essential posttranslational modification (PTM) that mostly occurs at lysine and arginine residues and regulates a variety of cellular processes. Owing to rapid progress in the large-scale identification of methylation sites, the available data set has expanded dramatically, and more attention has been paid to identifying the specific methylation types of modified residues. Here, we briefly summarize current progress in the computational prediction of methylation sites, which provides an accurate, rapid and efficient approach in contrast with labor-intensive experiments. We collected 5421 methyllysines and methylarginines in 2592 proteins from the literature, and classified most of the sites into different types. Data analyses demonstrated that different types of methylated proteins were preferentially involved in different biological processes and pathways, whereas a unique sequence preference was observed for each type of methylation site. Thus, we developed a predictor, GPS-MSP, which can predict mono-, di- and tri-methylation types for specific lysines, and mono-, symmetric di- and asymmetric di-methylation types for specific arginines. We critically evaluated the performance of GPS-MSP and compared it with other existing tools. The results showed that classifying methylation sites into different types for training can considerably improve prediction accuracy. Taken together, we anticipate that our study provides a new lead for future computational analysis of protein methylation, and that predicting the methylation types of covalently modified lysine and arginine residues can generate more useful information for further experimental manipulation.