Larson Scott A, Hilser Vincent J
Department of Human Biological Chemistry and Genetics, 5.162 Medical Research Bldg., University of Texas Medical Branch, Galveston, TX 77555-1068, USA.
Protein Sci. 2004 Jul;13(7):1787-801. doi: 10.1110/ps.04706204.
Classification of the amounts and types of lower order structural elements in proteins is a prerequisite to effective comparisons between protein folds. In an effort to provide an additional vehicle for fold comparison, we present an alternative classification scheme whereby protein folds are represented in statistical thermodynamic terms in such a way as to illuminate the energetic building blocks within protein structures. The thermodynamic relationship is examined between amino acid sequences and the conformational ensembles for a database of 159 Homo sapiens protein structures ranging from 50 to 250 amino acids. Using hierarchical clustering, it is shown through fold-recognition experiments that (1) eight thermodynamic environmental descriptors sufficiently account for the energetic variation within the native state ensembles of the H. sapiens structural database, (2) an amino acid library of only six residue types is sufficient to encode >90% of the thermodynamic information required for fold specificity in the entire database, and (3) structural resolution of the statistically derived environments reveals sequential cooperative segments throughout the protein, which are independent of secondary structure. As the first level of thermodynamic organization in proteins, these segments represent the thermodynamic counterpart to secondary structure.