Prediction of the bonding states of cysteines using the support vector machines based on multiple feature vectors and cysteine state sequences.

Authors

Chen Yu-Ching, Lin Yeong-Shin, Lin Chih-Jen, Hwang Jenn-Kang

Affiliation

Institute of Bioinformatics, National Chiao Tung University, HsinChu, Taiwan, ROC.

Publication information

Proteins. 2004 Jun 1;55(4):1036-42. doi: 10.1002/prot.20079.

DOI:10.1002/prot.20079

PMID:15146500

Abstract

The support vector machine (SVM) method is used to predict the bonding states of cysteines. Besides using local descriptors such as the local sequences, we include global information, such as amino acid compositions and the patterns of the states of cysteines (bonded or nonbonded), or cysteine state sequences, of the proteins. We found that SVM based on local sequences or global amino acid compositions yielded similar prediction accuracies for the data set comprising 4136 cysteine-containing segments extracted from 969 nonhomologous proteins. However, the SVM method based on multiple feature vectors (combining local sequences and global amino acid compositions) significantly improves the prediction accuracy, from 80% to 86%. If coupled with cysteine state sequences, SVM based on multiple feature vectors yields 90% in overall prediction accuracy and a 0.77 Matthews correlation coefficient, around 10% and 22% higher than the corresponding values obtained by SVM based on local sequence information.
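The combination of local and global descriptors described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the window size or the encoding, so the 7-residue window, the one-hot encoding, the function names, and the toy sequence below are all assumptions made for illustration, not the authors' implementation. The resulting vectors would be passed to an SVM trainer, which is not shown here. The last function computes the Matthews correlation coefficient from confusion-matrix counts, the second metric reported in the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def local_window_features(seq, pos, half_width=3):
    """One-hot encode the residues in a window centred on seq[pos].

    Window positions that fall off either end of the chain, or that hold a
    non-standard residue, are encoded as all zeros (an assumption).
    """
    features = []
    for i in range(pos - half_width, pos + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= i < len(seq) and seq[i] in AMINO_ACIDS:
            one_hot[AMINO_ACIDS.index(seq[i])] = 1.0
        features.extend(one_hot)
    return features


def composition_features(seq):
    """Global descriptor: fraction of each standard amino acid in the chain."""
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def combined_features(seq, pos, half_width=3):
    """Concatenate the local and global descriptors into one feature vector."""
    return local_window_features(seq, pos, half_width) + composition_features(seq)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy sequence: build one feature vector per cysteine.
toy_seq = "MKCAYCLLG"
cys_positions = [i for i, aa in enumerate(toy_seq) if aa == "C"]
vectors = [combined_features(toy_seq, p) for p in cys_positions]
```

With these assumed settings each cysteine yields a vector of 7 x 20 local entries plus 20 composition entries. A cysteine state sequence (the bonded or nonbonded pattern over all cysteines of a protein) would be an additional protein-level descriptor and is not modelled in this sketch.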

Similar articles

Predicting the state of cysteines based on sequence information.

J Theor Biol. 2010 Dec 7;267(3):312-8. doi: 10.1016/j.jtbi.2010.09.002. Epub 2010 Sep 6.

Cooperativity of the oxidization of cysteines in globular proteins.

J Theor Biol. 2004 Nov 7;231(1):85-95. doi: 10.1016/j.jtbi.2004.06.002.

Identifying cysteines and histidines in transition-metal-binding sites using support vector machines and neural networks.

Proteins. 2006 Nov 1;65(2):305-16. doi: 10.1002/prot.21135.

Prediction of disulfide connectivity from protein sequences.

Proteins. 2005 Nov 15;61(3):507-12. doi: 10.1002/prot.20627.

Analysis of factors that induce cysteine bonding state.

Comput Biol Med. 2009 Apr;39(4):332-9. doi: 10.1016/j.compbiomed.2009.01.006. Epub 2009 Feb 25.

Prediction of protein-protein interaction sites using support vector machines.

Protein Eng Des Sel. 2004 Feb;17(2):165-73. doi: 10.1093/protein/gzh020. Epub 2004 Jan 20.

Prediction of protein subcellular localization.

Proteins. 2006 Aug 15;64(3):643-51. doi: 10.1002/prot.21018.

Remote homolog detection using local sequence-structure correlations.

Proteins. 2004 Nov 15;57(3):518-30. doi: 10.1002/prot.20221.

Disulfide connectivity prediction based on structural information without a prior knowledge of the bonding state of cysteines.

Comput Biol Med. 2013 Nov;43(11):1941-8. doi: 10.1016/j.compbiomed.2013.09.008. Epub 2013 Sep 18.

Cited by

Predicting Anticancer Drug Resistance Mediated by Mutations.

Pharmaceuticals (Basel). 2022 Jan 24;15(2):136. doi: 10.3390/ph15020136.

The structure-based cancer-related single amino acid variation prediction.

Sci Rep. 2021 Jun 30;11(1):13599. doi: 10.1038/s41598-021-92793-w.

ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network.

Molecules. 2017 Oct 17;22(10):1732. doi: 10.3390/molecules22101732.

Accurate disulfide-bonding network predictions improve ab initio structure prediction of cysteine-rich proteins.

Bioinformatics. 2015 Dec 1;31(23):3773-81. doi: 10.1093/bioinformatics/btv459. Epub 2015 Aug 7.

On the structural context and identification of enzyme catalytic residues.

Biomed Res Int. 2013;2013:802945. doi: 10.1155/2013/802945. Epub 2013 Feb 3.

Prediction of disulfide connectivity in proteins with machine-learning methods and correlated mutations.

BMC Bioinformatics. 2013;14 Suppl 1(Suppl 1):S10. doi: 10.1186/1471-2105-14-S1-S10. Epub 2013 Jan 14.

Protein disulfide topology determination through the fusion of mass spectrometric analysis and sequence-based prediction using Dempster-Shafer theory.

BMC Bioinformatics. 2013;14 Suppl 2(Suppl 2):S20. doi: 10.1186/1471-2105-14-S2-S20. Epub 2013 Jan 21.

CMD: A Database to Store the Bonding States of Cysteine Motifs with Secondary Structures.

Adv Bioinformatics. 2012;2012:849830. doi: 10.1155/2012/849830. Epub 2012 Oct 10.

Analysis and functional prediction of reactive cysteine residues.

J Biol Chem. 2012 Feb 10;287(7):4419-25. doi: 10.1074/jbc.R111.275578. Epub 2011 Dec 6.

Redox biology: computational approaches to the investigation of functional cysteine residues.

Antioxid Redox Signal. 2011 Jul 1;15(1):135-46. doi: 10.1089/ars.2010.3561. Epub 2011 Apr 14.
