Enhanced protein fold recognition using a structural alphabet.

Author information

Deschavanne Patrick, Tufféry Pierre

Affiliations

Equipe de Bioinformatique Génomique et Moléculaire, INSERM UMR-S 726, Université Paris Diderot-Paris 7, F75013, Paris, France.

Publication information

Proteins. 2009 Jul;76(1):129-37. doi: 10.1002/prot.22324.

DOI:10.1002/prot.22324

PMID:19089985

Abstract

Fold recognition from sequence can be an important step in protein structure and function prediction. Many methods have tackled this goal. Most of them, based on sequence alignment, fail for sequences of low similarity. Alignment-free approaches can provide an efficient alternative. For such approaches, the identification of efficient fold discriminatory features is critical. We propose a new fold recognition approach that relies on the encoding of the local structure of proteins using a Hidden Markov Model Structural Alphabet. This encoding provides a 1D description of the conformation of complete protein structures, including loops. At the fold level, compared with the classical secondary structure helix, strand, and coil states, such encoding is expected to provide better discrimination between loop conformations, and hence better fold identification. Compared with previous related approaches, this additional information results in significant improvement. When combining this information with supplementary information on secondary structure and residue burial, we obtain a fold recognition accuracy of 78% for 27 protein families, that is, 8% higher than the best available method so far, and of 68% for 60 families. The corresponding scores at the class level are 92% and 90%, indicating that mispredictions are mostly within structural classes.
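
To make the idea of an alignment-free comparison concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not say which features or classifier the authors use, so the k-mer composition profile, the Euclidean distance, the nearest-reference rule, and the four-letter alphabet `ABCD` are all assumptions introduced here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical 4-letter alphabet, chosen only to keep the example small;
# real structural alphabets have more letters.
ALPHABET = "ABCD"


def kmer_profile(encoded, k=2, alphabet=ALPHABET):
    """Normalized k-mer frequencies of a structural-alphabet string.

    The vector has a fixed order (all k-mers over `alphabet`), so two
    proteins of different lengths can be compared without an alignment.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    n_windows = len(encoded) - k + 1
    if n_windows <= 0:
        return [0.0] * len(kmers)
    counts = Counter(encoded[i:i + k] for i in range(n_windows))
    return [counts[m] / n_windows for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def nearest_fold(query, references, k=2):
    """Return the label of the reference whose profile is closest to the query.

    `references` maps a fold label to one encoded string (an assumption
    made for brevity; a real classifier would use many examples per fold).
    """
    q = kmer_profile(query, k)
    return min(
        references,
        key=lambda label: euclidean(q, kmer_profile(references[label], k)),
    )


if __name__ == "__main__":
    refs = {"fold_1": "AAAAAA", "fold_2": "CDCDCD"}
    print(nearest_fold("AAAAB", refs))
```

The fixed-length profile is what makes the comparison alignment-free: every protein maps to a vector of the same dimension regardless of its length.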

Similar articles

Enhanced protein fold recognition using a structural alphabet.

Proteins. 2009 Jul;76(1):129-37. doi: 10.1002/prot.22324.

Hidden Markov models that use predicted local structure for fold recognition: alphabets of backbone geometry.

Proteins. 2003 Jun 1;51(4):504-14. doi: 10.1002/prot.10369.

A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.

J Mol Biol. 1997 Apr 11;267(4):1026-38. doi: 10.1006/jmbi.1997.0924.

Combining local-structure, fold-recognition, and new fold methods for protein structure prediction.

Proteins. 2003;53 Suppl 6:491-6. doi: 10.1002/prot.10540.

Sequence comparison and protein structure prediction.

Curr Opin Struct Biol. 2006 Jun;16(3):374-84. doi: 10.1016/j.sbi.2006.05.006. Epub 2006 May 19.

Sequence-based protein structure prediction using a reduced state-space hidden Markov model.

Comput Biol Med. 2007 Sep;37(9):1211-24. doi: 10.1016/j.compbiomed.2006.10.014. Epub 2006 Dec 11.

Protein topology recognition from secondary structure sequences: application of the hidden Markov models to the alpha class proteins.

J Mol Biol. 1997 Mar 28;267(2):446-63. doi: 10.1006/jmbi.1996.0874.

Protein structure prediction of CASP5 comparative modeling and fold recognition targets using consensus alignment approach and 3D assessment.

Proteins. 2003;53 Suppl 6:410-7. doi: 10.1002/prot.10548.

Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

Bioinformatics. 2007 Dec 15;23(24):3320-7. doi: 10.1093/bioinformatics/btm527. Epub 2007 Nov 7.

HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins.

J Mol Biol. 2000 Aug 4;301(1):173-90. doi: 10.1006/jmbi.2000.3837.

Cited by

SAFlex: A structural alphabet extension to integrate protein structural flexibility and missing data information.

PLoS One. 2018 Jul 5;13(7):e0198854. doi: 10.1371/journal.pone.0198854. eCollection 2018.

Exploring the potential of a structural alphabet-based tool for mining multiple target conformations and target flexibility insight.

PLoS One. 2017 Aug 17;12(8):e0182972. doi: 10.1371/journal.pone.0182972. eCollection 2017.

ProFold: Protein Fold Classification with Additional Structural Features and a Novel Ensemble Classifier.

Biomed Res Int. 2016;2016:6802832. doi: 10.1155/2016/6802832. Epub 2016 Aug 28.

Using Local States To Drive the Sampling of Global Conformations in Proteins.

J Chem Theory Comput. 2016 Mar 8;12(3):1368-79. doi: 10.1021/acs.jctc.5b00992. Epub 2016 Feb 12.

Improving protein fold recognition using the amalgamation of evolutionary-based and structural based information.

BMC Bioinformatics. 2014;15 Suppl 16(Suppl 16):S12. doi: 10.1186/1471-2105-15-S16-S12. Epub 2014 Dec 8.

NMRDSP: an accurate prediction of protein shape strings from NMR chemical shifts and sequence data.

PLoS One. 2013 Dec 23;8(12):e83532. doi: 10.1371/journal.pone.0083532. eCollection 2013.

Detecting protein candidate fragments using a structural alphabet profile comparison approach.

PLoS One. 2013 Nov 26;8(11):e80493. doi: 10.1371/journal.pone.0080493. eCollection 2013.

A strategy to select suitable physicochemical attributes of amino acids for protein fold recognition.

BMC Bioinformatics. 2013 Jul 24;14:233. doi: 10.1186/1471-2105-14-233.

Local conformational changes in the DNA interfaces of proteins.

PLoS One. 2013;8(2):e56080. doi: 10.1371/journal.pone.0056080. Epub 2013 Feb 13.

Structural alphabets derived from attractors in conformational space.

BMC Bioinformatics. 2010 Feb 20;11:97. doi: 10.1186/1471-2105-11-97.
