Kozak M
Department of Biological Sciences, University of Pittsburgh, PA 15260.
Nucleic Acids Res. 1987 Oct 26;15(20):8125-48. doi: 10.1093/nar/15.20.8125.
5'-Noncoding sequences have been compiled from 699 vertebrate mRNAs. (GCC)GCC(A/G)CCATGG emerges as the consensus sequence for initiation of translation in vertebrates. The most highly conserved position in that motif is the purine in position -3 (three nucleotides upstream from the ATG codon); 97% of vertebrate mRNAs have a purine, most often A, in that position. The periodic occurrence of G (in positions -3, -6, -9) is discussed. Upstream ATG codons occur in fewer than 10% of vertebrate mRNAs at large; oncogene transcripts are a notable exception, as two-thirds of them have ATG codons preceding the start of the major open reading frame. The leader sequences of most vertebrate mRNAs fall in the size range of 20 to 100 nucleotides. The significance of shorter and longer 5'-noncoding sequences is discussed.
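The position numbering used in the abstract can be illustrated with a minimal Python sketch. It assumes the A of the ATG codon is position +1 and the nucleotide immediately upstream is -1, so position -3 lies three nucleotides upstream and the G that follows ATG in the consensus is +4. The function names `kozak_context` and `upstream_atg_positions`, and the use of a 0-based string index for the start codon, are hypothetical choices for illustration and do not come from the paper.

```python
def kozak_context(seq: str, atg_index: int) -> dict:
    """Report the -3 and +4 nucleotides around a start codon.

    atg_index is the 0-based index of the A in the ATG (an assumption of
    this sketch). RNA input is accepted; U is treated as T.
    """
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    # Position -3 is three nucleotides upstream of the A of ATG.
    minus3 = seq[atg_index - 3] if atg_index >= 3 else None
    # Position +4 is the nucleotide immediately after the ATG.
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else None
    return {
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in ("A", "G"),
        "g_at_plus4": plus4 == "G",
    }


def upstream_atg_positions(seq: str, atg_index: int) -> list:
    """Return 0-based indices of ATG triplets lying wholly within the leader."""
    leader = seq.upper().replace("U", "T")[:atg_index]
    return [i for i in range(len(leader) - 2) if leader[i:i + 3] == "ATG"]


if __name__ == "__main__":
    consensus = "GCCGCCACCATGG"  # one instance of the consensus, with A at -3
    print(kozak_context(consensus, 9))
    print(upstream_atg_positions(consensus, 9))
```

The sketch only inspects the two positions the abstract singles out (-3 and the G after the ATG); it does not score the remaining consensus positions or attempt to predict initiation efficiency.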