School of Engineering & Physics, University of the South Pacific, Laucala Bay, Suva, Fiji.
Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan.
Genes (Basel). 2020 Dec 20;11(12):1524. doi: 10.3390/genes11121524.
Post-translational modification (PTM) is a biological process that modifies the proteome and thereby alters normal cell biology and pathogenesis. Numerous PTMs have been reported in recent years, and lysine phosphoglycerylation is among the most recently described. The traditional methods of identifying phosphoglycerylated residues, which are experimental procedures such as mass spectrometry, have proven to be time-consuming and costly, particularly given the abundance of proteins being sequenced in this post-genomic era. Because of these drawbacks, computational techniques are being sought to establish an effective system for identifying phosphoglycerylated lysine residues. Predictors for phosphoglycerylation already exist, but a new one is needed because the latest predictor falls short in adequately detecting phosphoglycerylated and non-phosphoglycerylated lysine residues.
In this work, we introduce a new predictor named RAM-PGK, which uses sequence-based information about amino acid residues to predict phosphoglycerylated and non-phosphoglycerylated sites. A benchmark dataset containing experimentally identified phosphoglycerylated and non-phosphoglycerylated lysine residues was employed for this purpose. For each lysine residue in the protein sequences of the dataset, we extracted the residue adjacency matrix and converted it into a feature vector; these feature vectors were used to build the phosphoglycerylation predictor.
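The abstract does not spell out how the residue adjacency matrix is built, so the following is only an illustrative sketch of one plausible reading: take a fixed window around each lysine, count how often each ordered pair of neighbouring amino acids occurs, and flatten the 20 x 20 count matrix into a 400-dimensional vector. The function names, the `flank` window size, and the `X` padding character are all assumptions made for this example and are not taken from RAM-PGK.

```python
# Illustrative sketch only: the window size, padding, and counting scheme
# are assumptions, not the published RAM-PGK feature definition.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def lysine_windows(sequence, flank=3):
    """Yield (position, window) for every lysine (K) in the sequence.

    The sequence is padded with 'X' so that lysines near either terminus
    still produce a window of length 2 * flank + 1.
    """
    padded = "X" * flank + sequence + "X" * flank
    for pos, residue in enumerate(sequence):
        if residue == "K":
            yield pos, padded[pos:pos + 2 * flank + 1]


def adjacency_features(window):
    """Count ordered pairs of neighbouring residues and flatten to 400 values."""
    matrix = [[0] * len(AMINO_ACIDS) for _ in AMINO_ACIDS]
    for left, right in zip(window, window[1:]):
        # Pairs involving padding or non-standard residues are skipped.
        if left in INDEX and right in INDEX:
            matrix[INDEX[left]][INDEX[right]] += 1
    return [count for row in matrix for count in row]


if __name__ == "__main__":
    for pos, window in lysine_windows("MAKLKTA", flank=2):
        features = adjacency_features(window)
        print(pos, window, sum(features))
```

Each resulting vector would then be paired with its phosphoglycerylated or non-phosphoglycerylated label and passed to a classifier such as a support vector machine.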
RAM-PGK, which is based on sequential features and support vector machine classifiers, shows a noteworthy improvement in performance over some of the recent prediction methods. The performance metrics of the RAM-PGK predictor are: 0.5741 sensitivity, 0.6436 specificity, 0.0531 precision, 0.6414 accuracy, and 0.0824 Matthews correlation coefficient.
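For readers less familiar with these five metrics, the sketch below computes them from the counts of a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical; they are not the RAM-PGK confusion matrix, which the abstract does not report.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive (phosphoglycerylated) sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    Ratios with a zero denominator are reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    for name, value in classification_metrics(tp=5, fp=5, tn=85, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Because precision depends on the number of false positives relative to true positives, it can be far lower than sensitivity and specificity when non-phosphoglycerylated sites greatly outnumber phosphoglycerylated ones.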