College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China.
State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China.
Int J Mol Sci. 2022 Apr 12;23(8):4263. doi: 10.3390/ijms23084263.
Protein phosphorylation is one of the most critical post-translational modifications in eukaryotes and is essential for a variety of biological processes. Many attempts have been made to improve the performance of computational predictors of phosphorylation sites. However, most of them rely on extra domain knowledge or feature selection. In this article, we present TransPhos, a novel deep learning-based predictor of phosphorylation sites built from a transformer encoder and densely connected convolutional neural network blocks. Experiments are conducted on the PPA (version 3.0) and Phospho.ELM datasets. The experimental results show that TransPhos performs better than several deep learning models, including convolutional neural networks (CNN), long short-term memory networks (LSTM), recurrent neural networks (RNN), and fully connected neural networks (FCNN), and better than several state-of-the-art prediction tools, including GPS2.1, NetPhos, PPRED, Musite, PhosphoSVM, SKIPHOS, and DeepPhos. On the training datasets for serine (S), threonine (T), and tyrosine (Y), our model achieves AUC values of 0.8579, 0.8335, and 0.6953, respectively, under 10-fold cross-validation. These results indicate that TransPhos considerably outperforms competing predictors in general protein phosphorylation site prediction.
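The AUC figures above come from 10-fold cross-validation. As a rough illustration of that evaluation protocol, and not the authors' code, the following minimal sketch computes a cross-validated AUC in plain Python. All function names (`roc_auc`, `stratified_folds`, `cross_validated_auc`) and the `fit` callback are hypothetical, and the use of stratified folds is an assumption; the abstract does not say how the folds were drawn.

```python
import random


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of samples that share the same score.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def stratified_folds(labels, k=10, seed=0):
    """Deal shuffled negatives and positives round-robin into k folds.

    Assumption: stratification is used so every fold contains both classes.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def cross_validated_auc(samples, labels, fit, k=10, seed=0):
    """Mean held-out AUC over k folds.

    `fit(train_samples, train_labels)` is a hypothetical callback that must
    return a scoring function mapping one sample to a float.
    """
    aucs = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        score = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        aucs.append(roc_auc([labels[i] for i in test_idx],
                            [score(samples[i]) for i in test_idx]))
    return sum(aucs) / len(aucs)
```

In practice `fit` would train a model such as TransPhos on the nine training folds and return its scoring function; here any callable with that shape works. Each class needs at least `k` members so that no fold ends up with a single class.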