
Recognition of 27-class protein folds by adding the interaction of segments and motif information.

Authors

Feng Zhenxing, Hu Xiuzhen

Affiliation

Department of Sciences, Inner Mongolia University of Technology, Hohhot, China.

Publication information

Biomed Res Int. 2014;2014:262850. doi: 10.1155/2014/262850. Epub 2014 Jul 21.

DOI: 10.1155/2014/262850
PMID: 25136571
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4127253/
Abstract

The recognition of protein folds is an important step in predicting protein structure and function. Since Ding and Dubchak's recognition of 27-class protein folds in 2001, the prediction algorithms, prediction parameters, and datasets used for protein fold prediction have all improved. However, the influence of interactions between predicted secondary structure segments, and of motif information, on protein folding has not been considered. Recognizing 27-class protein folds with segment interactions and motif information is therefore very important. Based on the 27-class folds dataset built by Liu et al., amino acid composition, the interactions of secondary structure segments, motif frequency, and predicted secondary structure information were extracted. Using the Random Forest algorithm and an ensemble classification strategy, the 27-class protein folds and the corresponding structural classification were identified by independent test. The overall accuracies for the testing set and for structural classification reached 78.38% and 92.55%, respectively. When the training set and testing set were combined, the overall accuracy by 5-fold cross validation was 81.16%. For comparison with earlier results, the method was also tested on Ding and Dubchak's dataset, which has been widely used by previous researchers, and an improved overall accuracy of 70.24% was obtained.
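Two of the quantities named in the abstract, amino acid composition and overall accuracy, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 20-letter residue alphabet ordering, and the handling of non-standard residues are assumptions made here for illustration, and the other features (segment interactions, motif frequency, predicted secondary structure) and the Random Forest classifier itself are not shown.

```python
from collections import Counter

# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Characters outside the 20 standard residues are ignored
    (an assumption of this sketch).
    """
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of proteins whose predicted fold equals the true fold."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)
```

In a pipeline of this kind, a vector such as the one returned by `amino_acid_composition` would be concatenated with the other feature groups before training a classifier, and `overall_accuracy` would be computed on the independent test set or averaged over cross-validation folds.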


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/012d/4127253/4cff7165d803/BMRI2014-262850.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/012d/4127253/4c1397eefdd8/BMRI2014-262850.001.jpg


