Dreyfus M
Laboratoire de Génétique Moleculaire, Ecole Normale Supérieure, Paris, France.
J Mol Biol. 1988 Nov 5;204(1):79-94. doi: 10.1016/0022-2836(88)90601-8.
Small DNA fragments (60 to 80 nucleotides), obtained at random from a collection of 14 catabolic, biosynthetic or regulatory Escherichia coli genes, were shotgun-cloned in place of the lacZ ribosome binding site. A total of 47 recombinants showing substantial β-galactosidase synthesis (at least 1/30th of the wild-type level) were isolated, and their newly acquired translational starts were characterized. Of these, 46 were found to carry a ribosome binding site from one of the original genes, and only one carried a non-natural start. Moreover, 12 of the 14 natural starts were recovered. The two that were not recovered are the only ones lacking a Shine-Dalgarno element. Thus, real starts are generally active in the lac mRNA, whereas the many sites (approx. 100 in this gene collection) that carry a Shine-Dalgarno element followed by AUG or GUG but are located in intra- or intergenic regions, or on non-transcribed strands, are inactive. I conclude that: (1) these "false" starts, being strongly discriminated against in the lac message, are presumably also inactive in their original mRNAs; (2) the discriminating information, being portable from one mRNA to another, must be contained within a small DNA region surrounding the starts (indeed, I further show that it generally lies within a sequence of about 35 nucleotides bracketing real starts); and (3) this information must have a larger effect on initiation than the exact structure of the mRNA, because the discrimination persists despite a complete change of this structure. Previous statistical analysis has shown that real starts differ from false starts in having a non-random sequence composition from nucleotides -20 to +15 with respect to the start. To determine whether these biases constitute the discriminating information or simply reflect coding constraints, translational starts were searched for at random in eukaryotic, largely non-coding, DNA. These "eukaryotic" starts all have an in-phase AUG or GUG, preceded by a typical Shine-Dalgarno sequence; outside these elements, the initiator region is strikingly rich in A and poor in C. These biases match those found around real starts, demonstrating that they are indeed part of the initiation signal. Finally, I describe a simple procedure for introducing any DNA fragment in place of the lac operator site on the E. coli chromosome.
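The notion of a candidate start used above (a Shine-Dalgarno element followed by AUG or GUG) and the -20 to +15 composition window can be illustrated with a minimal Python sketch. It is not the procedure used in the paper. The Shine-Dalgarno cores (4-nucleotide substrings of AGGAGG), the 5 to 13 nucleotide spacing, the coordinate convention (+1 is the first base of the start codon, with no position 0) and the function names `find_candidate_starts` and `window_composition` are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: Shine-Dalgarno cores taken as 4-nt substrings of AGGAGG.
SD_CORES = ("AGGA", "GGAG", "GAGG")
# AUG / GUG written in DNA letters.
START_CODONS = ("ATG", "GTG")
# Assumption: allowed gap (in nt) between the end of the SD core and the start codon.
SPACER_RANGE = (5, 13)


def find_candidate_starts(seq):
    """Return 0-based indices of ATG/GTG codons preceded by an SD core
    at an allowed spacing. Only the given strand is scanned."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] not in START_CODONS:
            continue
        for spacer in range(SPACER_RANGE[0], SPACER_RANGE[1] + 1):
            sd_end = i - spacer      # index just past the SD core
            sd_start = sd_end - 4    # SD cores are 4 nt long
            if sd_start >= 0 and seq[sd_start:sd_end] in SD_CORES:
                hits.append(i)
                break                # one matching SD is enough for this codon
    return hits


def window_composition(seq, start, upstream=20, downstream=15):
    """Count bases from -20 to +15 around a start codon at index `start`.
    The window is clipped at the ends of the sequence."""
    seq = seq.upper()
    lo = max(0, start - upstream)
    hi = min(len(seq), start + downstream)
    return Counter(seq[lo:hi])


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper.
    toy = "TTTTAGGAGGTTTTTTATGAAACCC"
    for pos in find_candidate_starts(toy):
        print(pos, dict(window_composition(toy, pos)))
```

Under these assumptions, every site reported by `find_candidate_starts` is only a candidate: the abstract's point is that most such sites are inactive, so the composition of the surrounding window (for example its A and C content) is the additional feature one would compare between real and false starts.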