Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg 67000, France.
Unité de Microbiologie Structurale, Institut Pasteur, CNRS, 75724 Paris Cedex 15, France.
RNA. 2019 Dec;25(12):1714-1730. doi: 10.1261/rna.072074.119. Epub 2019 Sep 10.
The origin of the genetic code remains enigmatic five decades after it was elucidated, although there is growing evidence that the code coevolved progressively with the ribosome. A number of primordial codes have been proposed as ancestors of the modern genetic code, including comma-free codes such as the RRY, RNY, or GNC codes (R = G or A, Y = C or T, N = any nucleotide), and the circular code, an error-correcting code that also allows identification and maintenance of the reading frame. It was demonstrated previously that motifs of the circular code are significantly enriched in the protein-coding genes of most organisms, from bacteria to eukaryotes. Here, we show that imprints of this code also exist in the ribosomal RNA (rRNA). In a large-scale study involving 133 organisms representative of the three domains of life, we identified 32 universal motifs that are conserved in the rRNA of >90% of the organisms. Intriguingly, most of the universal motifs are located in rRNA regions involved in important ribosome functions, notably in the peptidyl transferase center and the decoding center that form the original "proto-ribosome." Building on the existing accretion models for ribosome evolution, we propose that error-correcting circular codes represented an important step in the emergence of the modern genetic code. Thus, circular codes would have allowed the simultaneous coding of amino acids and synchronization of the reading frame in primitive translation systems, prior to the emergence of more sophisticated start codon recognition and translation initiation mechanisms.