Ikehara K, Amada F, Yoshida S, Mikata Y, Tanaka A
Department of Chemistry, Faculty of Science, Nara Women's University, Kita-uoya Nishi-machi, Japan.
Nucleic Acids Res. 1996 Nov 1;24(21):4249-55. doi: 10.1093/nar/24.21.4249.
Base compositions were examined at each codon position in more than 50 genes from taxonomically diverse bacteria and in the corresponding antisense sequences of those genes. We propose that the nonstop frame on the antisense strand [NSF(a)] of GC-rich bacterial genes is the most promising source of newly born genes. The reasons are: (i) NSF(a) appears frequently on the antisense strand of GC-rich bacterial genes; (ii) base compositions at the three codon positions are nearly symmetrical between a gene with about 55% GC content and its corresponding NSF(a); (iii) amino acid compositions of actual proteins are similar to those of hypothetical proteins encoded by GC-rich NSF(a); and (iv) proteins encoded by NSF(a) with 60% or more GC content are flexible enough, owing to their high glycine content, to adapt to the various molecules they encounter as novel substrates. To support this proposal, we computationally generated hypothetical antisense sequences with the same base composition as NSF(a) at each codon position and examined the properties of the proteins encoded by these imaginary genes. The results confirmed that NSF(a) of a GC-rich gene with about 60% GC content is sufficiently competent to serve as a newly born gene.
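The analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the function names are hypothetical, the standard stop codons (TAA, TAG, TGA) are assumed, and the random generator assumes that each codon position is sampled independently from its base composition, a detail the abstract does not specify.

```python
import random
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}       # assumed: standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = "ACGT"


def reverse_complement(seq):
    """Return the antisense strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons(seq, frame=0):
    """Split seq into complete codons starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def nonstop_frames(gene):
    """Frames (0, 1, 2) of the antisense strand that contain no stop codon."""
    anti = reverse_complement(gene)
    return [f for f in range(3)
            if not any(c in STOP_CODONS for c in codons(anti, f))]


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def positional_composition(seq, frame=0):
    """Base frequencies at codon positions 1, 2 and 3 (list of three dicts)."""
    counts = [Counter(), Counter(), Counter()]
    for codon in codons(seq.upper(), frame):
        for pos, base in enumerate(codon):
            counts[pos][base] += 1
    result = []
    for c in counts:
        total = sum(c.values())
        result.append({b: (c[b] / total if total else 0.0) for b in BASES})
    return result


def sample_sequence(composition, n_codons, rng):
    """Draw a hypothetical sequence whose codon positions follow `composition`.

    Assumption: positions are sampled independently of one another.
    Each position needs at least one non-zero frequency.
    """
    seq = []
    for _ in range(n_codons):
        for pos in range(3):
            weights = [composition[pos][b] for b in BASES]
            seq.append(rng.choices(BASES, weights=weights)[0])
    return "".join(seq)
```

As a usage sketch, `positional_composition(reverse_complement(gene), frame)` gives the profile of one antisense frame, and passing that profile to `sample_sequence(profile, n_codons, random.Random(0))` yields an imaginary gene that can be screened for stop codons or translated for amino acid composition.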