Donate L E, Rufino S D, Canard L H, Blundell T L
Imperial Cancer Research Fund, Department of Crystallography, Birkbeck College, University of London, United Kingdom.
Protein Sci. 1996 Dec;5(12):2600-16. doi: 10.1002/pro.5560051223.
Loops are regions of nonrepetitive conformation connecting regular secondary structures. We identified 2,024 loops of one to eight residues in length, with acceptable main-chain bond lengths and peptide bond angles, from a database of 223 protein and protein-domain structures. Each loop is characterized by its sequence, its main-chain conformation, and the relative disposition of its bounding secondary structures, described by the separation between the tips of their axes and the angle between them. Loops, grouped according to their length and the types of their bounding secondary structures, were superposed and clustered into 161 conformational classes, which together account for 63% of all loops. Of these, 109 classes (51% of the loops) were populated by at least four nonhomologous loops or four loops sharing a low sequence identity. The other 52 classes (12% of the loops) were populated by at least three loops of low sequence similarity from three or fewer nonhomologous groups. Loop class suprafamilies resulting from variations in the termini of secondary structures are discussed in this article. Most previously described loop conformations were found among the classes. New classes included a 2:4 type IV hairpin, a helix-capping loop, and a loop that mediates dinucleotide binding. The relative disposition of bounding secondary structures varies among loop classes, with some classes, such as beta-hairpins, being very restrictive. For each class, sequence preferences were identified as key residues; the residues found more frequently at these conserved positions than in proteins in general were Gly, Asp, Pro, Phe, and Cys. Most of these residues are involved in stabilizing loop conformation, often through a positive phi conformation or secondary structure capping. Identification of helix-capping residues and beta-breakers among the highly conserved positions supported our decision to group loops according to their bounding secondary structures.
Several of the identified loop classes were associated with specific functions. In some of these classes, all of the member loops had the same function and key residues were conserved for this purpose, as is the case for the parvalbumin-like calcium-binding loops. In other classes, a significant number, but not all, of the member loops had the same function, as is the case for the helix-turn-helix DNA-binding loops. This article provides a systematic and coherent conformational classification of loops, covering a broad range of lengths and all four combinations of bounding secondary structure types, and supplies a useful basis for modelling of loop conformations where the bounding secondary structures are known or reliably predicted.
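The abstract describes the relative disposition of the bounding secondary structures by two quantities: the separation between the tips of their axes and the angle between them. The following is a minimal sketch of how such a pair of values could be computed, assuming each axis is already available as two 3D endpoints. The function name `axis_geometry`, the endpoint representation, and the choice of which tips to measure between are illustrative assumptions, not the authors' published procedure.

```python
import math


def axis_geometry(axis1_start, axis1_end, axis2_start, axis2_end):
    """Return (separation, angle_in_degrees) for two secondary-structure axes.

    Assumption (not from the source): each axis is given as two 3D points,
    the loop leaves the first element at `axis1_end`, and it enters the
    second element at `axis2_start`.
    """
    # Direction vectors of the two axes.
    v1 = [b - a for a, b in zip(axis1_start, axis1_end)]
    v2 = [b - a for a, b in zip(axis2_start, axis2_end)]

    # Separation between the axis tips that flank the loop.
    separation = math.dist(axis1_end, axis2_start)

    # Angle between the axis directions, from the normalized dot product.
    dot = sum(x * y for x, y in zip(v1, v2))
    norm_product = math.hypot(*v1) * math.hypot(*v2)
    if norm_product == 0.0:
        raise ValueError("each axis needs two distinct endpoints")
    cos_angle = max(-1.0, min(1.0, dot / norm_product))  # guard rounding error
    angle = math.degrees(math.acos(cos_angle))

    return separation, angle


if __name__ == "__main__":
    # Two antiparallel axes 5 units apart, loosely resembling a hairpin.
    sep, ang = axis_geometry((0, 0, 0), (0, 0, 10), (5, 0, 10), (5, 0, 0))
    print(f"separation = {sep:.1f}, angle = {ang:.1f} degrees")
```

In this toy input the two axes run in opposite directions, so the computed angle is 180 degrees and the tip separation is 5 units. Real axes would first have to be fitted to the main-chain atoms of each helix or strand, which this sketch does not attempt.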