International Iberian Nanotechnology Laboratory (INL), Braga, Portugal.
PLoS One. 2012;7(7):e41322. doi: 10.1371/journal.pone.0041322. Epub 2012 Jul 26.
Protein structure is the cumulative result of amino acid residues interacting with each other through space and/or chemical bonds. Despite the large number of high-resolution protein structures available, the "protein structure code" has not been fully identified. This manuscript presents a novel approach to protein structure analysis that aims to identify rules for the spatial packing of amino acid pairs in proteins. We investigated 8706 high-resolution, non-redundant protein chains and quantified amino acid pair interactions in terms of solvent accessibility, spatial and sequence distance, secondary structure, and sequence length. The number of pairs found in a particular environment is stored in one cell of an 8-dimensional data tensor. Plotting the cell population against the number of cells that share that population size reveals a scale-free organization. Among the amino acid pairs contributing to cells with a population above 50, pairs of Ala, Ile, Leu and Val dominate, a result that is statistically highly significant. We postulate that such pairs form "structural stability points" in the protein structure. Our data show that they are located in buried α-helices or β-strands, at a spatial distance of 3.8-4.3 Å and a sequence distance of >4 residues. We speculate that the scale-free organization of amino acid pair interactions in the 8D protein structure, combined with the clear dominance of pairs of Ala, Ile, Leu and Val, is important for understanding the very nature of protein structure formation. Our observations suggest that protein structures should be considered as having a higher-dimensional organization.
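The counting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the eight tensor axes or their binning, so the axis layout, the bin labels, and the function names `fill_tensor`, `population_spectrum` and `residue_pairs_above` are all hypothetical. The sketch stores the tensor sparsely, as a mapping from an 8-coordinate cell to its pair count, then derives the "population versus number of cells" spectrum and tallies which residue pairs populate the cells above a threshold. Weighting each cell by its pair count in that tally is likewise an assumption.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple

# One coordinate per tensor axis. The meaning of each axis is an assumption.
Cell = Tuple[Hashable, ...]


def fill_tensor(pairs: Iterable[Cell], n_axes: int = 8) -> Counter:
    """Count how many residue pairs fall into each cell of a sparse tensor."""
    tensor: Counter = Counter()
    for cell in pairs:
        if len(cell) != n_axes:
            raise ValueError(f"expected {n_axes} coordinates, got {len(cell)}")
        tensor[cell] += 1
    return tensor


def population_spectrum(tensor: Counter) -> dict:
    """Map each population size to the number of cells holding that many pairs."""
    return dict(sorted(Counter(tensor.values()).items()))


def residue_pairs_above(tensor: Counter, threshold: int = 50,
                        aa_axes: Tuple[int, int] = (0, 1)) -> Counter:
    """Tally residue-type pairs over cells whose population exceeds threshold.

    Assumes the residue types sit on axes 0 and 1, and weights each cell by
    the number of pairs it holds.
    """
    tally: Counter = Counter()
    for cell, population in tensor.items():
        if population > threshold:
            key = tuple(sorted(cell[i] for i in aa_axes))
            tally[key] += population
    return tally


# Toy input with a hypothetical axis layout: residue types, burial of each
# residue, secondary structure of each residue, spatial bin, sequence bin.
toy_pairs = (
    [("LEU", "VAL", "buried", "buried", "helix", "helix", "3.8-4.3", ">4")] * 3
    + [("ALA", "GLY", "exposed", "exposed", "coil", "coil", "5.0-6.0", "<=4")]
)

tensor = fill_tensor(toy_pairs)
spectrum = population_spectrum(tensor)
dominant = residue_pairs_above(tensor, threshold=2)
```

On real data, a scale-free organization would show up as an approximately straight line when `spectrum` is plotted on log-log axes; the toy input here is far too small to show that.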