Prediction of protein structural class for the twilight zone sequences.

Author information

Kurgan Lukasz, Chen Ke

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada.

Publication information

Biochem Biophys Res Commun. 2007 Jun 1;357(2):453-60. doi: 10.1016/j.bbrc.2007.03.164. Epub 2007 Apr 5.

DOI:10.1016/j.bbrc.2007.03.164

PMID:17433260

Abstract

Structural class characterizes the overall folding type of a protein or its domain. This paper develops an accurate method for in silico prediction of structural classes from low-homology (twilight zone) protein sequences. The proposed LLSC-PRED method applies a linear logistic regression classifier and a custom-designed, feature-based sequence representation to provide predictions. The main advantages of LLSC-PRED are its comprehensive representation, which includes 58 features describing the composition and physicochemical properties of the sequences, and the transparency of the prediction model. The representation also includes predicted secondary structure content, thus for the first time exploring the synergy between these two related predictions. Based on tests performed with a large set of 1673 twilight zone domains, LLSC-PRED's prediction accuracy, which exceeds 62%, is shown to be better than the accuracy of over a dozen recently published competing in silico methods and similar to the accuracy of other, non-transparent classifiers that use the proposed representation.
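The general idea of pairing a feature-based sequence representation with a linear logistic regression classifier can be illustrated with a minimal sketch. This is not the LLSC-PRED implementation: the 58 features and the predicted secondary structure content described in the abstract are not reproduced here. The sketch assumes a much simpler representation (20 amino acid composition fractions plus one hydrophobic fraction), and the function names, the hydrophobic grouping, the training settings, and the toy sequences are all hypothetical and invented for illustration only.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # assumed grouping, chosen only for illustration


def featurize(seq):
    """Return 20 composition fractions followed by one hydrophobic fraction."""
    seq = seq.upper()
    n = len(seq)
    composition = [seq.count(a) / n for a in AMINO_ACIDS]
    hydrophobic = sum(1 for r in seq if r in HYDROPHOBIC) / n
    return composition + [hydrophobic]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, x):
    """Class probabilities for feature vector x; last weight in each row is the bias."""
    xb = x + [1.0]
    return softmax([sum(w * v for w, v in zip(row, xb)) for row in W])


def train(X, y, n_classes, epochs=500, lr=0.5):
    """Fit multinomial logistic regression by batch gradient descent."""
    d = len(X[0])
    W = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (d + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            xb = x + [1.0]
            p = predict_proba(W, x)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                for j in range(d + 1):
                    grad[k][j] += err * xb[j]
        for k in range(n_classes):
            for j in range(d + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


# Toy sequences invented for this sketch (not real proteins).
# Class 0 is rich in helix-favoring residues, class 1 in sheet-favoring ones.
TOY = [
    ("AELLKKAEELLKR", 0),
    ("EELLAKAMEELAK", 0),
    ("VTVYVIKVTYTVI", 1),
    ("TVIVYTWVTVKVY", 1),
]
X = [featurize(s) for s, _ in TOY]
y = [label for _, label in TOY]
model = train(X, y, n_classes=2)

# Each row of `model` holds one weight per feature, so the contribution of
# every feature to every class can be read off directly.
probs = predict_proba(model, featurize("AEELLKKLAEELL"))
predicted_class = probs.index(max(probs))
```

Because the model is linear, each learned weight maps to exactly one named feature, which is the sense in which such a classifier is transparent compared with, for example, a kernel-based model. A real experiment would need a curated low-homology dataset and a held-out evaluation protocol, neither of which is shown here.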

Similar articles

PFRES: protein fold classification by using evolutionary information and predicted secondary structure.

Bioinformatics. 2007 Nov 1;23(21):2843-50. doi: 10.1093/bioinformatics/btm475. Epub 2007 Oct 17.

Prediction of protein secondary structure content for the twilight zone sequences.

Proteins. 2007 Nov 15;69(3):486-98. doi: 10.1002/prot.21527.

Prediction of protein structural class using novel evolutionary collocation-based sequence representation.

J Comput Chem. 2008 Jul 30;29(10):1596-604. doi: 10.1002/jcc.20918.

SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences.

BMC Bioinformatics. 2008 May 1;9:226. doi: 10.1186/1471-2105-9-226.

Prediction of protein structural class with Rough Sets.

BMC Bioinformatics. 2006 Jan 14;7:20. doi: 10.1186/1471-2105-7-20.

A new representation for protein secondary structure prediction based on frequent patterns.

Bioinformatics. 2006 Nov 1;22(21):2628-34. doi: 10.1093/bioinformatics/btl453. Epub 2006 Aug 29.

Classifier ensembles for protein structural class prediction with varying homology.

Biochem Biophys Res Commun. 2006 Sep 29;348(3):981-8. doi: 10.1016/j.bbrc.2006.07.141. Epub 2006 Jul 31.

Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

Bioinformatics. 2007 Dec 1;23(23):3147-54. doi: 10.1093/bioinformatics/btm505. Epub 2007 Oct 17.

Protein superfamily classification using fuzzy rule-based classifier.

IEEE Trans Nanobioscience. 2009 Mar;8(1):92-9. doi: 10.1109/TNB.2009.2016484. Epub 2009 Mar 21.

Cited by

Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences.

Comput Math Methods Med. 2021 May 7;2021:5529389. doi: 10.1155/2021/5529389. eCollection 2021.

Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach.

Int J Mol Sci. 2015 Dec 24;17(1):15. doi: 10.3390/ijms17010015.

General overview on structure prediction of twilight-zone proteins.

Theor Biol Med Model. 2015 Sep 4;12:15. doi: 10.1186/s12976-015-0014-1.

Customised fragments libraries for protein structure prediction based on structural class annotations.

BMC Bioinformatics. 2015 Apr 29;16(1):136. doi: 10.1186/s12859-015-0576-2.

Novel numerical characterization of protein sequences based on individual amino acid and its application.

Biomed Res Int. 2015;2015:909567. doi: 10.1155/2015/909567. Epub 2015 Feb 2.

Quad-PRE: a hybrid method to predict protein quaternary structure attributes.

Comput Math Methods Med. 2014;2014:715494. doi: 10.1155/2014/715494. Epub 2014 May 18.

PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.

PLoS One. 2014 Mar 27;9(3):e92863. doi: 10.1371/journal.pone.0092863. eCollection 2014.

Proposing a highly accurate protein structural class predictor using segmentation-based features.

BMC Genomics. 2014;15 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2164-15-S1-S2. Epub 2014 Jan 24.

A strategy to select suitable physicochemical attributes of amino acids for protein fold recognition.

BMC Bioinformatics. 2013 Jul 24;14:233. doi: 10.1186/1471-2105-14-233.

Comparison study on statistical features of predicted secondary structures for protein structural class prediction: From content to position.

BMC Bioinformatics. 2013 May 4;14:152. doi: 10.1186/1471-2105-14-152.
