GPSD: a hybrid learning framework for the prediction of phosphatase-specific dephosphorylation sites.

Author information

Han Cheng, Fu Shanshan, Chen Miaomiao, Gou Yujie, Liu Dan, Zhang Chi, Huang Xinhe, Xiao Leming, Zhao Miaoying, Zhang Jiayi, Xiao Qiang, Peng Di, Xue Yu

Affiliations

Department of Bioinformatics and Systems Biology, MOE Key Laboratory of Molecular Biophysics, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Luoyu Road 1037, Wuhan, Hubei 430074, China.

School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Luoyu Road 1037, Wuhan, Hubei 430074, China.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae694.

Abstract

Protein phosphorylation is dynamically and reversibly regulated by protein kinases and protein phosphatases, and plays an essential role in orchestrating a wide range of biological processes. Although a number of tools have been developed for predicting kinase-specific phosphorylation sites (p-sites), computational prediction of phosphatase-specific dephosphorylation sites remains a great challenge. In this study, we manually curated 4393 experimentally identified site-specific phosphatase-substrate relationships for 3463 dephosphorylation sites occurring on phosphoserine, phosphothreonine, and/or phosphotyrosine residues, from the literature and public databases. We then developed a hybrid learning framework, the group-based prediction system for the prediction of phosphatase-specific dephosphorylation sites (GPSD). For model training, we integrated 10 types of sequence features and used three types of machine learning methods: penalized logistic regression, deep neural networks, and transformer neural networks. First, a pretrained model was constructed using 561,416 nonredundant p-sites and then fine-tuned to generate computational models for predicting general dephosphorylation sites. In addition, 103 individual phosphatase-specific predictors were constructed via transfer learning and meta-learning. For site prediction, one or more protein sequences in FASTA format can be submitted, and the prediction results are shown together with additional annotations, such as protein-protein interactions, structural information, and disorder propensity. The online service of GPSD is freely available at https://gpsd.biocuckoo.cn/. We believe that GPSD can serve as a valuable tool for further analysis of dephosphorylation.
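To make the input side of such a predictor concrete, the following is a minimal sketch of how FASTA sequences might be turned into candidate serine, threonine, and tyrosine sites with a surrounding peptide window. It is not GPSD's implementation: the function names `parse_fasta` and `candidate_sites`, the flank length of 7 residues, and the `*` padding character are all assumptions chosen for illustration, and the abstract does not state which window size or encoding GPSD uses.

```python
from typing import Dict, Iterator, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {record_id: sequence} mapping."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else "unnamed"
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {rid: "".join(chunks) for rid, chunks in records.items()}


def candidate_sites(
    seq: str,
    flank: int = 7,          # hypothetical window half-width, not from the paper
    residues: str = "STY",   # serine, threonine, tyrosine
    pad: str = "*",          # hypothetical filler for positions past the termini
) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, residue, peptide window) for each S/T/Y."""
    padded = pad * flank + seq + pad * flank
    for i, aa in enumerate(seq):
        if aa in residues:
            # In the padded string the residue sits at index i + flank,
            # so the window of length 2 * flank + 1 starts at index i.
            yield i + 1, aa, padded[i : i + 2 * flank + 1]


if __name__ == "__main__":
    demo = ">demo_protein example\nMKSAT\n"
    for rid, seq in parse_fasta(demo).items():
        for pos, aa, window in candidate_sites(seq, flank=2):
            print(rid, pos, aa, window)
```

Each window produced this way could then be encoded as sequence features and scored by a classifier; padding keeps every window the same length even for sites near the protein termini.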

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/34c8/11695897/9f3e9f8166a4/bbae694f1.jpg
