Al'tshteĭn A D, Efimov A V
Mol Biol (Mosk). 1988 Sep-Oct;22(5):1411-29.
A progene hypothesis was proposed earlier to explain how the self-reproducing genetic system originated. Progenes (precursors of the genetic system) are mixed anhydrides of an amino acid and a deoxyribotrinucleotide at the 3'-gamma-terminal phosphate (NpNpNppp-AA); they are produced from dinucleotides (NpNp) and 3'-gamma-aminoacylnucleotidylates (Nppp-AA) as a result of a specific interaction between the amino acid and the dinucleotide. The postulated mechanism of progene formation accounts for the selection of substances, including chirality, and for the origin of the genetic code, as well as for the mechanisms of formation, self-reproduction and evolution of the simplest genetic system ("gene-polypeptide"). A stereochemical analysis of the progene formation mechanism has allowed us to support the main statements of the hypothesis that relate to the origin of the genetic code and to the selection of substances. Atomic groups that could be responsible for the specificity of the interaction between dinucleotides and amino acids in progene formation have been identified. Stereochemical evidence for a physicochemical basis of the origin of the existing genetic code has been obtained: 1) the second nucleotide of the codon is shown to play a special role in amino acid coding by the progene hypothesis principle; 2) an advantage of T over U in such coding is demonstrated; 3) for 16 of the 20 amino acids, the optimal dinucleotide revealed by the stereochemical analysis agrees with the codon dinucleotides; 4) an explanation is offered for the mechanism of third-nucleotide selection. A reconstruction of the prebiotic code based on these results indicates that the code contains 32 codons and is statistical and group-wise. It encodes 7 groups of isofunctional amino acids: 3 overlapping groups of non-polar amino acids, namely 1) medium-size hydrophobic amino acids (chiefly Val, n-Val and a-But), 2) small and medium-size non-polar amino acids (chiefly Ala, Val, n-Val, a-But and Gly) and 3) small non-polar amino acids (Gly, Ala, a-But); and 4 groups of polar amino acids, namely 1) hydroxy plus dicarboxylic (Asp, Glu, Ser and Thr), 2) dicarboxylic (Asp and Glu), 3) hydroxy (Ser and Thr) and 4) basic (Arg and Lys). The code includes about 20 amino acids, of which 15-17 are canonical and a few are common non-canonical ones. The prebiotic code explains many properties of the existing genetic code and is capable of evolving into the latter through gradual replacement of the physicochemical coding mechanism by the enzymatic coding mechanism.