Kozak M
Nucleic Acids Res. 1984 Jan 25;12(2):857-72. doi: 10.1093/nar/12.2.857.
5'-Noncoding sequences have been tabulated for 211 messenger RNAs from higher eukaryotic cells. The 5'-proximal AUG triplet serves as the initiator codon in 95% of the mRNAs examined. The most conspicuous conserved feature is the presence of a purine (most often A) three nucleotides upstream from the AUG initiator codon; only 6 of the mRNAs in the survey have a pyrimidine in that position. There is a predominance of C in positions -1, -2, -4 and -5, just upstream from the initiator codon. The sequence CC(A/G)CCAUG(G) thus emerges as a consensus sequence for eukaryotic initiation sites. The extent to which the ribosome binding site in a given mRNA matches the -1 to -5 consensus sequence varies: more than half of the mRNAs in the tabulation have 3 or 4 nucleotides in common with the CCACC consensus, but only ten mRNAs conform perfectly.
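The comparison described above (how many of the five positions -5 to -1 agree with CCACC, and whether position -3 is a purine) can be sketched in a few lines of Python. This is only an illustrative sketch, not the tabulation method used in the survey: the function name `score_context`, its arguments, and the example sequences are hypothetical, and the G following the AUG is reported under the usual convention that the A of AUG is position +1.

```python
# Consensus for positions -5..-1 as quoted in the abstract.
CONSENSUS = "CCACC"
PURINES = {"A", "G"}


def score_context(mrna: str, aug_index: int) -> dict:
    """Compare the five nucleotides upstream of an AUG with CCACC.

    mrna      -- RNA sequence (hypothetical input; T is read as U)
    aug_index -- 0-based index of the A of the AUG being evaluated
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 5:
        raise ValueError("need at least five nucleotides upstream of the AUG")

    upstream = seq[aug_index - 5:aug_index]  # positions -5 .. -1
    matches = sum(a == b for a, b in zip(upstream, CONSENSUS))
    return {
        "upstream": upstream,
        "matches_to_CCACC": matches,                      # 0..5
        "purine_at_minus_3": upstream[2] in PURINES,      # index 2 is position -3
        "g_at_plus_4": seq[aug_index + 3:aug_index + 4] == "G",
    }


if __name__ == "__main__":
    # Made-up sequences, not taken from the survey.
    print(score_context("GCCACCAUGG", 6))
    print(score_context("AAGUCAUGC", 5))
```

Under these assumptions, a site with 3 or 4 matches falls into the "more than half" group mentioned in the abstract, while 5 matches corresponds to the ten mRNAs that conform perfectly.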