Prediction and Analysis of Post-Translational Pyruvoyl Residue Modification Sites from Internal Serines in Proteins

Authors

Jiang Yang, Li Bi-Qing, Zhang Yuchao, Feng Yuan-Ming, Gao Yu-Fei, Zhang Ning, Cai Yu-Dong

Affiliation

Department of Surgery, China-Japan Union Hospital of Jilin University, Changchun, P. R. China.

Publication details

PLoS One. 2013 Jun 21;8(6):e66678. doi: 10.1371/journal.pone.0066678. Print 2013.

DOI: 10.1371/journal.pone.0066678

PMID: 23805260

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3689656/

Abstract

Most pyruvoyl-dependent proteins observed in prokaryotes and eukaryotes are critical regulatory enzymes and are primary targets of inhibitors for anti-cancer and anti-parasitic therapy. These proteins undergo an autocatalytic, intramolecular self-cleavage reaction in which a covalently bound pyruvoyl group is generated on a conserved serine residue. The modified serine sites have traditionally been detected experimentally, which is often labor-intensive and time-consuming. In this study, we made an initial attempt to predict such serine sites computationally, using feature selection based on a random forest. Because the protein database currently contains only a small number of experimentally verified pyruvoyl-modified proteins, the dataset used in this study was small. After removing proteins with sequence identity >60%, we obtained a non-redundant dataset of only 46 proteins, each with one pyruvoyl serine site. Several types of features were considered, including PSSM conservation scores, disorder, secondary structure, solvent accessibility, amino acid factors, and amino acid occurrence frequencies. The method performed well on this dataset: the best accuracy and MCC were 100.00% and 1.0000 on the training dataset, and 93.75% and 0.8441 on the testing dataset. The optimal feature set contained 9 features. Analysis of this set indicated that some specific features play important roles in determining the pyruvoyl serine sites, consistent with several earlier experimental results. The selected features may shed light on the mechanism of the post-translational self-maturation process and provide guidance for experimental validation. As more pyruvoyl-modified proteins are found, the method should be evaluated on larger datasets. The prediction software can be downloaded from http://www.nkbiox.com/sub/pyrupred/index.html.
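The abstract reports performance as accuracy and MCC (Matthews correlation coefficient). As a reading aid, the sketch below shows how the two metrics are conventionally computed from a binary confusion matrix. The function names and the example counts are illustrative assumptions; they are not taken from the paper, and the authors' own evaluation code may differ.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (NOT the paper's confusion matrix).
example = dict(tp=9, tn=9, fp=1, fn=1)
print(f"accuracy = {accuracy(**example):.4f}")
print(f"MCC      = {mcc(**example):.4f}")
```

A classifier with no false positives and no false negatives gives an accuracy of 100% and an MCC of exactly 1, which matches the pairing of values reported for the training dataset. Unlike accuracy, MCC stays informative when positive and negative sites are unequal in number, which is a plausible reason for reporting both.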


