Rost B
EMBL, Heidelberg, Germany.
Fold Des. 1997;2(3):S19-24. doi: 10.1016/s1359-0278(97)00059-x.
A protein sequence folds into a unique three-dimensional protein structure. Different sequences, though, can fold into similar structures. How stable is a protein structure with respect to sequence changes? What percentage of the sequence is 'anchor' residues, that is, residues crucial for protein structure and function? Here, answers to these questions are pursued by analyzing large numbers of structurally homologous protein pairs. Most pairs of similar structures have sequence identity as low as expected from randomly related sequences (8-9%). On average, only 3-4% of all residues are 'anchor' residues. The symmetric shape of the distribution at low sequence identity suggests that for most structures, four billion years of evolution was sufficient to reach an equilibrium. The mean identities for convergent (different ancestor) and divergent (same ancestor) evolution of proteins to similar structures are quite close and hence, in most cases, it is difficult to distinguish between the two effects. In particular, low levels of sequence identity appear not to be indicative of convergent evolution.
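As an illustration of the quantity discussed above, pairwise sequence identity for one aligned protein pair could be computed along the following lines. This is a minimal sketch, not the method used in the study: the function name `percent_identity`, the `-` gap character, and the choice to normalize by the number of gap-free aligned positions are all assumptions made here, since the abstract does not state how identity was defined.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over gap-free aligned positions.

    Assumption: both inputs are rows of the same pairwise alignment,
    so they have equal length and use `gap` to mark insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    aligned = 0    # positions where both sequences have a residue
    identical = 0  # aligned positions where the residues match

    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip positions with a gap in either sequence
        aligned += 1
        if a == b:
            identical += 1

    if aligned == 0:
        return 0.0  # no residue pairs to compare
    return 100.0 * identical / aligned
```

For example, `percent_identity("ACDE", "ACDF")` compares four aligned residue pairs, three of which match. Other normalizations (for instance, dividing by the length of the shorter sequence) give different values for the same alignment, so any comparison with the percentages quoted in the abstract would depend on matching the definition used there.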