Analyzing the sequence-structure relationship of a library of local structural prototypes.

Author information

Benros Cristina, de Brevern Alexandre G, Hazout Serge

Affiliations

Equipe de Bioinformatique Génomique et Moléculaire, INSERM UMR-S726, Université Denis Diderot-Paris 7, Place Jussieu, Paris, France.

Publication information

J Theor Biol. 2009 Jan 21;256(2):215-26. doi: 10.1016/j.jtbi.2008.08.032. Epub 2008 Oct 14.

Abstract

We present a thorough analysis of the relation between amino acid sequence and local three-dimensional structure in proteins. A library of overlapping local structural prototypes was built using an unsupervised clustering approach called the "hybrid protein model" (HPM). The HPM carries out a multiple structural alignment of local folds from a non-redundant protein structure databank encoded into a structural alphabet composed of 16 protein blocks (PBs). Building on previous research on the HPM protocol, we have considered gaps in the local structure prototypes, which allows fragments of variable length. In this way, 120 local structure prototypes were obtained. Twenty-five percent of the protein fragments learnt by the HPM had gaps. An investigation of tight turns suggested that they are mainly derived from three PB series with precise locations in the HPM. The amino acid information content of all the conformational classes was examined with multivariate methods, e.g., canonical correlation analysis. This analysis points to seven amino acid equivalence classes showing high propensities for preferential local structures. In the same way, the definition of "contrast factors" based on sequence-structure properties underlines the specificity of certain structural prototypes, e.g., the dependence of Gly- or Asn-rich turns on a limited number of PBs, or the opposition between Pro-rich coils and those enriched in Ser, Thr, Asn and Glu. These results are thus useful for analyzing sequence-structure relationships, and could also be used to improve fragment-based methods for protein structure prediction from sequence.
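
As a rough illustration of what an amino acid "propensity" for a local structure class can mean, the sketch below computes a simple log-odds score: the frequency of a residue within one class divided by its background frequency over all classes. This is not the canonical correlation analysis or the "contrast factors" described in the abstract. The function name `propensities`, the class labels "turn" and "coil", and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def propensities(fragments):
    """Log2 odds of each amino acid per structural class.

    fragments: list of (class_label, amino_acid_sequence) pairs.
    Returns {class_label: {aa: log2(freq in class / background freq)}}.
    Only residues observed in a class get a score for that class.
    """
    background = Counter()
    per_class = {}
    for label, seq in fragments:
        background.update(seq)
        per_class.setdefault(label, Counter()).update(seq)

    total = sum(background.values())
    result = {}
    for label, counts in per_class.items():
        n = sum(counts.values())
        result[label] = {
            aa: math.log2((c / n) / (background[aa] / total))
            for aa, c in counts.items()
        }
    return result


# Toy, made-up fragments (not data from the paper).
toy = [("turn", "GNGD"), ("turn", "NGGS"), ("coil", "PPSP"), ("coil", "PTPE")]
scores = propensities(toy)
```

A positive score means the residue is over-represented in that class relative to the background, a score near zero means no preference, and residues never seen in a class are simply absent from its dictionary.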

