Prediction of protein structural classes using support vector machines.

Author information

Sun X-D, Huang R-B

Affiliation

College of Life Science and Biotechnology, Guangxi University, Nanning, Guangxi, China.

Publication information

Amino Acids. 2006 Jun;30(4):469-75. doi: 10.1007/s00726-005-0239-0. Epub 2006 Apr 20.

DOI:10.1007/s00726-005-0239-0

PMID:16622605

Abstract

The support vector machine (SVM), a machine-learning method, is used to predict four structural classes, namely mainly alpha, mainly beta, alpha-beta, and fss (few secondary structures), from the topology level of the CATH protein structure database. In binary classification, any two structural classes that do not share any secondary structure elements (such as alpha and beta elements) can be separated with accuracy as high as 90%. Accuracy drops below 70%, however, when the structural classes to be separated have structural elements in common. Our study also shows that feature spaces of dimension 20^2 = 400 (dipeptide) and 20^3 = 8,000 (tripeptide) give nearly the same prediction accuracy. Across these four structural classes, multi-class classification gives an overall accuracy of about 52%, indicating that the multi-class classification technique in support vector machines may still need further improvement in future work.
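
The feature-space sizes quoted in the abstract follow from counting every possible dipeptide (20 x 20) or tripeptide (20 x 20 x 20) over the 20 standard amino acids. The abstract does not say how the features were computed, so the sketch below is only one plausible reading: it assumes each protein is encoded as the normalized frequency of its overlapping k-peptides. The function name `kmer_composition`, the residue ordering, and the example sequence are hypothetical and are not taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (ordering is an assumption)


def kmer_composition(sequence: str, k: int) -> list[float]:
    """Return the normalized k-peptide frequency vector of length 20**k."""
    # Assign a fixed position to every possible k-peptide.
    index = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}
    counts = [0] * len(index)
    windows = 0
    # Slide an overlapping window of width k along the sequence.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in index:  # skip windows containing non-standard letters
            counts[index[kmer]] += 1
            windows += 1
    # Normalize so that proteins of different lengths are comparable.
    return [c / windows for c in counts] if windows else [0.0] * len(counts)


# Made-up example sequence, used only to show the vector sizes.
dipeptide = kmer_composition("MKVLAAGIVALLLAAGCSSA", k=2)
tripeptide = kmer_composition("MKVLAAGIVALLLAAGCSSA", k=3)
print(len(dipeptide), len(tripeptide))  # 400 8000
```

Vectors of this kind could then be passed to any SVM implementation. The tripeptide vector is 20 times longer and far sparser for a protein of typical length, which is in line with the abstract's observation that the larger feature space gave nearly the same accuracy.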

Similar articles

Protein topology classification using two-stage support vector machines.

Genome Inform. 2006;17(2):259-69.

Prediction of protein subcellular localization by support vector machines using multi-scale energy and pseudo amino acid composition.

Amino Acids. 2007 Jul;33(1):69-74. doi: 10.1007/s00726-006-0475-y. Epub 2007 Jan 19.

Using support vector machines for prediction of protein structural classes based on discrete wavelet transform.

J Comput Chem. 2009 Jun;30(8):1344-50. doi: 10.1002/jcc.21115.

Protein topology recognition from secondary structure sequences: application of the hidden Markov models to the alpha class proteins.

J Mol Biol. 1997 Mar 28;267(2):446-63. doi: 10.1006/jmbi.1996.0874.

Prediction of protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and Naive Bayes Feature Fusion.

Amino Acids. 2006 Jun;30(4):461-8. doi: 10.1007/s00726-006-0263-8. Epub 2006 May 15.

LOCUSTRA: accurate prediction of local protein structure using a two-layer support vector machine approach.

J Chem Inf Model. 2008 Sep;48(9):1903-8. doi: 10.1021/ci800178a. Epub 2008 Sep 3.

Multi-class support vector machines for protein secondary structure prediction.

Genome Inform. 2003;14:218-27.

Prediction of protein structural classes.

Crit Rev Biochem Mol Biol. 1995;30(4):275-349. doi: 10.3109/10409239509083488.

Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

Bioinformatics. 2007 Dec 15;23(24):3320-7. doi: 10.1093/bioinformatics/btm527. Epub 2007 Nov 7.

Cited by

Revolutionizing oncology: the role of Artificial Intelligence (AI) as an antibody design, and optimization tools.

Biomark Res. 2025 Mar 29;13(1):52. doi: 10.1186/s40364-025-00764-4.

SERT-StructNet: Protein secondary structure prediction method based on multi-factor hybrid deep model.

Comput Struct Biotechnol J. 2024 Mar 22;23:1364-1375. doi: 10.1016/j.csbj.2024.03.018. eCollection 2024 Dec.

Comparative Study on Feature Selection in Protein Structure and Function Prediction.

Comput Math Methods Med. 2022 Oct 11;2022:1650693. doi: 10.1155/2022/1650693. eCollection 2022.

Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences.

Comput Math Methods Med. 2021 May 7;2021:5529389. doi: 10.1155/2021/5529389. eCollection 2021.

Prediction of protein structural classes by different feature expressions based on 2-D wavelet denoising and fusion.

BMC Bioinformatics. 2019 Dec 24;20(Suppl 25):701. doi: 10.1186/s12859-019-3276-5.

Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM.

Comput Math Methods Med. 2015;2015:370756. doi: 10.1155/2015/370756. Epub 2015 Dec 15.

Comparison study on statistical features of predicted secondary structures for protein structural class prediction: From content to position.

BMC Bioinformatics. 2013 May 4;14:152. doi: 10.1186/1471-2105-14-152.

Predicting chemical toxicity effects based on chemical-chemical interactions.

PLoS One. 2013;8(2):e56517. doi: 10.1371/journal.pone.0056517. Epub 2013 Feb 15.

Predicting metabolic pathways of small molecules and enzymes based on interaction information of chemicals and proteins.

PLoS One. 2012;7(9):e45944. doi: 10.1371/journal.pone.0045944. Epub 2012 Sep 21.
