Protein fold classification with genetic algorithms and feature selection.

Authors

Chen Peng, Liu Chunmei, Burge Legand, Mahmood Mohammad, Southerland William, Gloster Clay

Affiliation

Department of Systems and Computer Science, Howard University, 2300 Sixth Street, NW, Washington, DC 20059, USA.

Publication information

J Bioinform Comput Biol. 2009 Oct;7(5):773-88. doi: 10.1142/s0219720009004321.

DOI:10.1142/s0219720009004321
PMID:19785045
Abstract

Protein fold classification is a key step to predicting protein tertiary structures. This paper proposes a novel approach based on genetic algorithms and feature selection to classifying protein folds. Our dataset is divided into a training dataset and a test dataset. Each individual for the genetic algorithms represents a selection function of the feature vectors of the training dataset. A support vector machine is applied to each individual to evaluate the fitness value (fold classification rate) of each individual. The aim of the genetic algorithms is to search for the best individual that produces the highest fold classification rate. The best individual is then applied to the feature vectors of the test dataset and a support vector machine is built to classify protein folds based on selected features. Our experimental results on Ding and Dubchak's benchmark dataset of 27-class folds show that our approach achieves an accuracy of 71.28%, which outperforms current state-of-the-art protein fold predictors.
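The search loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: each individual is assumed to be a Boolean mask over the feature vector, fitness is the classification rate on a validation split held out from the training data (the abstract does not say how the rate is estimated), and a nearest-centroid classifier stands in for the support vector machine so that the example needs only the standard library. All function names, the toy data, and the GA settings (population size, mutation rate, single-point crossover, elitism) are hypothetical.

```python
import random

def nearest_centroid_accuracy(train, test, mask):
    """Stand-in classifier (NOT the paper's SVM): accuracy of a
    nearest-centroid rule that sees only the features selected by mask."""
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    sums, counts = {}, {}
    for x, y in train:
        s = sums.setdefault(y, [0.0] * len(idx))
        for j, i in enumerate(idx):
            s[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        return min(centroids, key=lambda y: sum(
            (x[i] - c) ** 2 for i, c in zip(idx, centroids[y])))

    return sum(predict(x) == y for x, y in test) / len(test)

def select_features(fitness, n_features, pop_size=20, generations=30,
                    p_mut=0.05, rng=None):
    """Genetic search for the feature mask with the highest fitness."""
    rng = rng or random.Random(0)
    # Start from the full feature set plus random masks.
    pop = [[True] * n_features]
    pop += [[rng.random() < 0.5 for _ in range(n_features)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability p_mut.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

def make_toy_data(n, rng):
    """Synthetic 3-class data: 2 informative features, 4 noise features."""
    data = []
    for _ in range(n):
        y = rng.randrange(3)
        x = [y + rng.gauss(0, 0.3), -y + rng.gauss(0, 0.3)]
        x += [rng.gauss(0, 3.0) for _ in range(4)]
        data.append((x, y))
    return data

rng = random.Random(42)
train = make_toy_data(90, rng)
valid = make_toy_data(60, rng)
test = make_toy_data(60, rng)

def fitness(mask):
    return nearest_centroid_accuracy(train, valid, mask)

best_mask = select_features(fitness, n_features=6, rng=rng)
test_accuracy = nearest_centroid_accuracy(train, test, best_mask)
print(best_mask, test_accuracy)
```

Because the full mask is seeded into the initial population and the best mask is carried over unchanged each generation, the returned mask never scores below the all-features baseline on the validation split. To follow the abstract more closely, `fitness` could instead train and score a support vector machine on the selected columns.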

Similar articles

1
Protein fold classification with genetic algorithms and feature selection.
J Bioinform Comput Biol. 2009 Oct;7(5):773-88. doi: 10.1142/s0219720009004321.
2
Predicting protein fold types by the general form of Chou's pseudo amino acid composition: approached from optimal feature extractions.
Protein Pept Lett. 2012 Apr;19(4):439-49. doi: 10.2174/092986612799789378.
3
An ensemble approach to protein fold classification by integration of template-based assignment and support vector machine classifier.
Bioinformatics. 2017 Mar 15;33(6):863-870. doi: 10.1093/bioinformatics/btw768.
4
SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
5
Recognition of 27-class protein folds by adding the interaction of segments and motif information.
Biomed Res Int. 2014;2014:262850. doi: 10.1155/2014/262850. Epub 2014 Jul 21.
6
Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.
Bioinformatics. 2007 Dec 15;23(24):3320-7. doi: 10.1093/bioinformatics/btm527. Epub 2007 Nov 7.
7
Evidence theoretic protein fold classification based on the concept of hyperfold.
Math Biosci. 2012 Dec;240(2):148-60. doi: 10.1016/j.mbs.2012.07.001. Epub 2012 Jul 20.
8
A machine learning information retrieval approach to protein fold recognition.
Bioinformatics. 2006 Jun 15;22(12):1456-63. doi: 10.1093/bioinformatics/btl102. Epub 2006 Mar 17.
9
SeqRate: sequence-based protein folding type classification and rates prediction.
BMC Bioinformatics. 2010 Apr 29;11 Suppl 3(Suppl 3):S1. doi: 10.1186/1471-2105-11-S3-S1.
10
Improving Protein Fold Recognition by Deep Learning Networks.
Sci Rep. 2015 Dec 4;5:17573. doi: 10.1038/srep17573.

Cited by

1
Sequence-Based Prediction of Plant Allergenic Proteins: Machine Learning Classification Approach.
ACS Omega. 2023 Jan 20;8(4):3698-3704. doi: 10.1021/acsomega.2c02842. eCollection 2023 Jan 31.