Ohno S
Beckman Research Institute, City of Hope, Duarte, CA 91010.
Proc Natl Acad Sci U S A. 1988 Jun;85(12):4378-82. doi: 10.1073/pnas.85.12.4378.
Modern coding sequences are in the periodic-to-chaotic transition. In the case of two related sequences for lens alpha A-crystallin and small heat shock protein, the original repeating units were heptameric in length. Accordingly, base trimers that were parts of heptameric units recurred far more frequently than those that were not included. In the crystallin coding sequence, CTG trimer recurred 21 times, and TCT and TCC trimers recurred 17 times each. By contrast, CTA and TCG, although related to the above, recurred only 4 and 3 times, respectively. It is a small wonder that 10 of the 16 leucine residues were encoded by CTG, whereas none was encoded by CTA, and that 17 of the 23 serine residues were encoded either by TCT or by TCC, whereas only 1 was encoded by TCG. In the small heat shock protein coding sequence, however, AGC became parts of the two prominent heptameric recurring units. Not surprisingly, 10 of the 22 serine residues were now encoded by AGC. In conclusion, the so-called codon preference is a mere reflection of the construction principle of coding sequences and has very little to do with selection per se.
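The two kinds of count the abstract compares (how often a base trimer recurs anywhere in the sequence, and how often it serves as an in-frame codon) can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the author's procedure: the function names `trimer_counts` and `codon_counts`, the heptamer `CTGTCTC`, and the toy sequence built from it are all hypothetical and are not taken from the crystallin or heat shock protein sequences.

```python
from collections import Counter


def trimer_counts(seq):
    """Count every overlapping base trimer, regardless of reading frame."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def codon_counts(seq, codons):
    """Count how often each given codon occurs in reading frame 0."""
    seq = seq.upper()
    in_frame = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return {codon: in_frame[codon] for codon in codons}


# Hypothetical heptameric unit, repeated six times (42 bases, 14 codons).
# Because 7 is not a multiple of 3, each repeat shifts the reading frame,
# so every trimer contained in the unit eventually lands in frame.
unit = "CTGTCTC"
toy = unit * 6

all_frames = trimer_counts(toy)
leucine = codon_counts(toy, ["CTG", "CTA"])
```

In this toy sequence, CTG is part of the repeating unit and therefore recurs both as a trimer and as an in-frame leucine codon, whereas CTA never occurs at all. A real coding sequence would show the same tendency only partially, since base substitutions have blurred the original periodicity.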