Kumar S, Bansal M
Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India.
Biophys J. 1998 Oct;75(4):1935-44. doi: 10.1016/S0006-3495(98)77634-9.
Understanding the sequence-structure relationships in globular proteins is important for reliable protein structure prediction and de novo design. Using a database of 1131 alpha-helices with nonidentical sequences from 205 nonhomologous globular protein chains, we have analyzed structural and sequence characteristics of alpha-helices. We find that geometries of more than 99% of all the alpha-helices can be simply characterised as being linear, curved, or kinked. Only a small number of alpha-helices (approximately 4%) show sharp localized bends in their middle regions, and thus are classified as kinked. About three-fourths (approximately 73%) of the alpha-helices in globular proteins show varying degrees of smooth curvature, with a mean radius of curvature of 65 ± 33 Å; longer helices are less curved. Computation of helix accessibility to the solvent indicates that nearly two-thirds of the helices (approximately 66%) are largely buried in the protein core, and the length and geometry of the helices are not correlated with their location in the protein globule. However, the amino acid compositions and propensities of individual amino acids to occur in alpha-helices vary with their location in the protein globule, their geometries, and their lengths. In particular, Gln, Glu, Lys, and Arg are found more often in helices near the surface of globular proteins. Interestingly, kinks often seem to occur in regions where amino acids with low helix propensities (e.g., beta-branched and aromatic residues) cluster together, in addition to those associated with the occurrence of proline residues. Hence the propensities of individual amino acids to occur in a given secondary structure depend not only on conformation but also on its length, geometry, and location in the protein globule.
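The abstract refers to the propensity of an amino acid to occur in alpha-helices without stating the formula used. As an illustration only, the sketch below assumes the commonly used normalized-frequency definition: the fraction of helical residues that are of a given type, divided by the fraction of all residues that are of that type. The function name `helix_propensities` and the `"H"` mask convention are hypothetical and are not taken from the paper.

```python
from collections import Counter


def helix_propensities(sequence, helix_mask):
    """Return {residue: propensity} for one sequence (or a concatenated set).

    Assumed definition (not stated in the abstract):
        propensity(aa) = (count of aa in helices / helical residues)
                         / (count of aa overall / all residues)
    `helix_mask` is a string of the same length as `sequence`;
    the character "H" marks a helical position (hypothetical convention).
    """
    if len(sequence) != len(helix_mask):
        raise ValueError("sequence and helix_mask must have the same length")

    total = Counter(sequence)
    in_helix = Counter(
        res for res, state in zip(sequence, helix_mask) if state == "H"
    )

    n_total = len(sequence)
    n_helix = sum(in_helix.values())
    if n_helix == 0:
        raise ValueError("helix_mask contains no helical positions")

    # Values above 1 mean the residue is over-represented in helices;
    # values below 1 mean it is under-represented.
    return {
        res: (in_helix[res] / n_helix) / (total[res] / n_total)
        for res in total
    }
```

Computing the same ratio separately for subsets of helices (for example buried versus surface, or linear versus curved versus kinked) would give the kind of location- and geometry-dependent comparison the abstract describes, under the same assumed definition.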