Route Cantonale 103, Saint Sulpice VD, Switzerland.
J Theor Biol. 2014 Apr 21;347:95-108. doi: 10.1016/j.jtbi.2014.01.002. Epub 2014 Jan 14.
Evolution of the genetic code in an early RNA world is dependent on the steadily improving specificity of the coevolving protein synthesis machinery for codons, anticodons, tRNAs and amino acids. In the beginning, there is RNA but the machinery does not distinguish yet between the codons, which therefore all encode the same information. Synonymous codons are equivalent under a symmetry group that exchanges (permutes) the codons without affecting the code. The initial group changes any codon into any other by permuting the order of the bases in the triplet as well as by replacing the four RNA bases with each other at every codon position. This group preserves the differences between codons, known as Hamming distances, with a 1-distance corresponding to a single point mutation. Stepwise breaking of the group into subgroups divides the 64 codons into progressively smaller subsets - blocks of equivalent codons under the smaller symmetry groups, with each block able to encode a different message. This formalism prescribes how the evolving machinery increasingly differentiates between codons. The model indicates that primitive ribosomes first identified a unique mRNA reading frame to break the group permuting the order of the bases and subsequently enforced increasingly stringent codon-anticodon basepairing rules to break the subgroups permuting the four bases at each codon position. The modern basepairing rules evolve in five steps and at each step the number of codon blocks doubles. The fourth step generates 16 codon blocks corresponding with the 16 family boxes of the standard code and the last step splits these boxes into 32 blocks of commonly two, but rarely one or three, synonymous codons. The evolving codes transmit at most one message per codon block and as the number of messages increases so does the specificity of the code and of protein synthesis. The selective advantage conferred by better functioning proteins drives the symmetry breaking process. 
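The initial symmetry group described above can be made concrete with a short sketch. It assumes the group consists of every reordering of the three codon positions combined with an independent substitution of the four bases at each position, which is one reading of the abstract's wording and not a construction taken from the paper. Under that reading the group has 3! x (4!)^3 = 82,944 elements. All function and variable names are hypothetical. The code checks that such a transformation permutes the 64 codons and preserves every pairwise Hamming distance, and it tabulates the block doubling over the five basepairing steps.

```python
from itertools import product
import random

BASES = "ACGU"
CODONS = ["".join(t) for t in product(BASES, repeat=3)]  # all 64 triplets


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def make_symmetry(order, base_maps):
    """Build one group element (assumed form, see lead-in).

    order     -- a permutation of (0, 1, 2): new position i reads old position order[i]
    base_maps -- three bijections on BASES, one per new codon position
    """
    def apply(codon):
        return "".join(base_maps[i][codon[order[i]]] for i in range(3))
    return apply


def random_symmetry(rng):
    """Draw a random element: a position order plus three base substitutions."""
    order = rng.sample(range(3), 3)
    base_maps = [dict(zip(BASES, rng.sample(BASES, 4))) for _ in range(3)]
    return make_symmetry(order, base_maps)


def preserves_hamming(sym):
    """True if sym leaves every pairwise codon distance unchanged."""
    return all(hamming(a, b) == hamming(sym(a), sym(b))
               for a in CODONS for b in CODONS)


# Order of the assumed group: 3! position orders times (4!)^3 base substitutions.
GROUP_ORDER = 6 * 24 ** 3


def blocks_after_step(k):
    """Codon blocks after k basepairing steps, doubling from one initial block."""
    return 2 ** k


if __name__ == "__main__":
    rng = random.Random(0)
    sym = random_symmetry(rng)
    print("permutes codons:", sorted(map(sym, CODONS)) == sorted(CODONS))
    print("preserves Hamming distances:", preserves_hamming(sym))
    print("group order:", GROUP_ORDER)
    print("blocks per step:", [blocks_after_step(k) for k in range(6)])
```

Distances are preserved because reordering positions only relabels which positions differ, and a bijection on bases never maps two different bases to the same one. The sketch does not model how the group is broken into subgroups at each step.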
Over time paralogous tRNA evolution expands the anticodon repertoire, which is divided into anticodon blocks matching the codon blocks under the stage-specific ribosomal basepairing rules. Contemporaneously an expanding family of primitive aminoacyl-tRNA synthetases (aaRSs) divides the tRNA diversities into various different and overlapping subsets: each aaRS accepts some tRNAs but rejects all others and several aaRSs may accept the same tRNA species. Selection favoring less ambiguous codes eliminates these overlaps and also imposes the ribosomal anticodon block division as ambiguity arises when different aaRSs accept tRNAs of the same anticodon block. Only when the tRNAs of one or several anticodon blocks are accepted by a unique aaRS does the code become specific. This coding pattern is observed in the standard code and the evolution of amino acid assignments by primitive aaRSs onto tRNAs is traced back via tRNA trees that picture a gradual division of tRNA diversities into blocks with increasingly specific amino acid assignments. Symmetry breaking combined with continuous selection for codes carrying more information evolves increasingly specific codes and efficiently traverses an immense space of all possible codes (>10^84) to give rise to the standard code.
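The abstract does not say how the size of the code space is counted. One count that is consistent with the stated lower bound assigns each of the 64 codons independently to one of 21 meanings (20 amino acids plus a stop signal), giving 21^64 codes. The sketch below evaluates that count; the choice of 21 meanings and the variable names are assumptions made for illustration only.

```python
# Assumed counting scheme (not stated in the abstract):
# every codon independently receives one of 21 meanings.
N_CODONS = 64
N_MEANINGS = 21  # assumption: 20 amino acids + 1 stop signal

n_codes = N_MEANINGS ** N_CODONS  # exact integer arithmetic

if __name__ == "__main__":
    print(f"possible codes: {n_codes:.3e}")
    print("exceeds 10^84:", n_codes > 10 ** 84)
```

Under this assumption the count comes to roughly 4 x 10^84, which falls just above the bound quoted in the abstract.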