Enhanced protein fold recognition using a structural alphabet.

Authors

Deschavanne Patrick, Tufféry Pierre

Affiliation

Equipe de Bioinformatique Génomique et Moléculaire, INSERM UMR-S 726, Université Paris Diderot-Paris 7, F75013, Paris, France.

Publication

Proteins. 2009 Jul;76(1):129-37. doi: 10.1002/prot.22324.

Abstract

Fold recognition from sequence can be an important step in protein structure and function prediction. Many methods have tackled this goal. Most of them are based on sequence alignment and fail for sequences of low similarity. Alignment-free approaches can provide an efficient alternative. For such approaches, the identification of efficient fold-discriminatory features is critical. We propose a new fold recognition approach that relies on encoding the local structure of proteins using a Hidden Markov Model Structural Alphabet. This encoding provides a 1D description of the conformation of complete protein structures, including loops. At the fold level, compared with the classical secondary structure states (helix, strand, and coil), this encoding is expected to discriminate better between loop conformations and hence to improve fold identification. Compared with previous related approaches, this additional information results in a significant improvement. When combining it with supplementary information on secondary structure and residue burial, we obtain a fold recognition accuracy of 78% for 27 protein families, that is, 8% higher than the best available method so far, and of 68% for 60 families. The corresponding scores at the class level are 92% and 90%, indicating that most mispredictions remain within the correct structural class.
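To make the idea of an alignment-free comparison over a 1D structural encoding more concrete, here is a minimal sketch. It is not the authors' method: the abstract does not specify which features or classifier are used. The sketch assumes that each protein has already been encoded as a string over a structural alphabet, and it uses a hypothetical 4-letter toy alphabet, k-mer frequency profiles, and cosine similarity purely for illustration. The names `ALPHABET`, `kmer_profile`, and `cosine_similarity` are invented for this example.

```python
from collections import Counter
from itertools import product
import math

# Hypothetical toy alphabet (assumption); a real structural alphabet has more letters.
ALPHABET = "ABCD"


def kmer_profile(encoded, k=2, alphabet=ALPHABET):
    """Return the normalized k-mer frequency vector of an encoded string.

    The vector has len(alphabet) ** k entries, in lexicographic k-mer order,
    so profiles of different proteins are directly comparable without alignment.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(encoded[i:i + k] for i in range(len(encoded) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up encoded strings, for illustration only.
query = kmer_profile("ABABCCDA")
candidate = kmer_profile("ABABCCDD")
similarity = cosine_similarity(query, candidate)
```

Because every protein maps to a fixed-length vector regardless of its length, such profiles can be compared or fed to a classifier without aligning the sequences, which is the general property that alignment-free approaches rely on.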

