Heringa J, Argos P
European Molecular Biology Laboratory, Heidelberg, Germany.
J Mol Biol. 1991 Jul 5;220(1):151-71. doi: 10.1016/0022-2836(91)90388-m.
A method has been developed to detect dense clusters of residue side-chains in proteins, where contact is based upon the percentage of the maximum possible for a given residue type. The clusters represent protein sites with the highest degree of interaction amongst their member residues, while contacts with the environment surrounding the cluster are lower in number. The method has been applied to three distinct structural sets of proteins to check for consistency: mixed alpha-helical/beta-sheet proteins, all beta-strand proteins, and all alpha-helical proteins. A number of cluster features generated from these sets are of general interest for protein folding. (1) A majority of the clusters, comprising three to four residues on average, are localized near the protein surfaces and not within the protein cores. (2) The clusters have preferences for the N- and C-terminal ends of alpha-helices and beta-strands in alpha/beta and alpha-proteins, while beta-proteins utilize the middle strand regions more often. A number of clusters connect three or more beta-strands and/or alpha-helices. (3) More than half of the clusters display residue pairs with oppositely charged atoms within 4.5 Å of each other. (4) The residue composition of the clusters does not show correlation with hydrophobicity measures but rather with side-chain volume and surface. The highly preferred cluster residues are (in order of decreasing preference) Trp, His, Arg, Tyr, Glu, Gln and Phe. Clusters with extensive internal contacts in related haemoglobin and immunoglobulin tertiary structures show respective conservation. Several examples illustrate "strategic" folding positions in proteins that often bring together a number of sheets and/or helices, suggesting a folding model in which largely preformed secondary structures are joined together in a cluster-induced collapse. Alternatively, the clusters may form at some stage in the folding process to reduce considerably the searchable conformational space and help maintain the proper folding pathway. The clusters also provide hints for site-directed mutagenesis and protein engineering experiments as they are also suggested to be important for structural stability.
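As an illustration of finding (3), the check for oppositely charged atoms lying within 4.5 Å of each other can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function name, the input layout, and the choice of which side-chain atoms count as charged (PDB-style atom names for Lys, Arg, Asp and Glu, with His left out) are all assumptions made for the example. Only the 4.5 Å cutoff comes from the abstract.

```python
import math

# Assumed definition of charged side-chain atoms, using PDB-style atom names.
# These sets are illustrative and are not taken from the paper.
POSITIVE = {("LYS", "NZ"), ("ARG", "NE"), ("ARG", "NH1"), ("ARG", "NH2")}
NEGATIVE = {("ASP", "OD1"), ("ASP", "OD2"), ("GLU", "OE1"), ("GLU", "OE2")}

CUTOFF = 4.5  # angstroms, the distance quoted in the abstract


def has_opposite_charge_pair(atoms, cutoff=CUTOFF):
    """Return True if any positive and negative atom lie within `cutoff`.

    `atoms` is a list of (residue_id, residue_name, atom_name, (x, y, z))
    tuples describing the atoms of one cluster (hypothetical layout).
    """
    positive = [a for a in atoms if (a[1], a[2]) in POSITIVE]
    negative = [a for a in atoms if (a[1], a[2]) in NEGATIVE]
    return any(
        math.dist(p[3], n[3]) <= cutoff
        for p in positive
        for n in negative
    )


# Made-up coordinates for a three-residue cluster.
cluster = [
    (10, "ARG", "NH1", (0.0, 0.0, 0.0)),
    (42, "GLU", "OE1", (3.0, 0.0, 0.0)),
    (57, "TRP", "CZ2", (1.0, 4.0, 0.0)),
]
print(has_opposite_charge_pair(cluster))
```

Here the Arg NH1 and Glu OE1 atoms are 3.0 Å apart, so the cluster would count as containing an oppositely charged pair under these assumptions.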