Daniels D L, Plunkett G, Burland V, Blattner F R
Laboratory of Genetics, University of Wisconsin, Madison 53706.
Science. 1992 Aug 7;257(5071):771-8. doi: 10.1126/science.1379743.
The DNA sequence of 91.4 kilobases of the Escherichia coli K-12 genome, spanning the region between rrnC at 84.5 minutes and rrnA at 86.5 minutes on the genetic map (85 to 87 percent on the physical map), is described. Analysis of this sequence identified 82 potential coding regions (open reading frames) covering 84 percent of the sequenced interval. The arrangement of these open reading frames, together with the consensus promoter sequences and terminator-like sequences found by computer searches, made it possible to assign them to proposed transcriptional units. More than half of the open reading frames correlated with known genes or with functions suggested by similarity to other sequences. The remainder encode proteins that are still unidentified. The sequenced region also contains several RNA genes and two types of repeated sequence elements. Intergenic regions include three "gray holes" of 0.6 to 0.8 kilobases with no recognizable function.