Cardon L R, Burge C, Schachtel G A, Blaisdell B E, Karlin S
Department of Mathematics, Stanford University, CA 94305.
Nucleic Acids Res. 1993 Aug 11;21(16):3875-84. doi: 10.1093/nar/21.16.3875.
The recent sequencing of two relatively long (approximately 100 kb) contigs of E. coli presents unique opportunities for investigating heterogeneity and genomic organization of the E. coli chromosome. We have evaluated a number of common and contrasting sequence features in the two new contigs with comparisons to all available E. coli sequences (>1.6 Mb). Our analyses include assessments of: (i) counts and distributions of restriction sites, special oligonucleotides (e.g., Chi sites, Dam and Dcm methylase targets), and other marker arrays; (ii) significant distant and close direct and inverted repeat sequences; (iii) sequence similarities between the long contigs and other E. coli sequences; (iv) characterization and identification of rare and frequent oligonucleotides; (v) compositional biases in short oligonucleotides; and (vi) position-dependent fluctuations in sequence composition. The two contigs reveal a number of distinctive features, including: a cluster of five repeat/dyad elements with very regular spacings resembling a transcription attenuator in one of the contigs; REP elements, ERICs, and other long repeats; distinction of the Chi sequence as the most frequent oligonucleotide; regions of clustering, overdispersion, and regularity of certain restriction sites and short palindromes; and comparative domains of inhomogeneities in the two long contigs. These and other features are discussed in relation to the organization of the E. coli chromosome.
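As a rough illustration of analyses (i) and (iv) in the abstract, counting marker oligonucleotides and tallying short-word frequencies might be sketched as follows. This is a minimal sketch and not the authors' method: the function names `motif_positions` and `kmer_counts` and the toy sequence are hypothetical, only one strand is scanned, and no statistical assessment of clustering or rarity is attempted. The motif strings are the commonly cited recognition sequences (Chi: GCTGGTGG; Dam target: GATC; Dcm target: CCWGG, with W = A or T); the abstract itself does not spell them out.

```python
import re
from collections import Counter

# Subset of IUPAC nucleotide codes, mapped to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "S": "[CG]", "N": "[ACGT]"}

# Commonly cited sequences (assumed here, not given in the abstract).
MARKERS = {"Chi": "GCTGGTGG", "Dam": "GATC", "Dcm": "CCWGG"}


def motif_positions(seq, motif):
    """Return 0-based start positions of all matches, overlaps included."""
    pattern = "".join(IUPAC[base] for base in motif)
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]


def kmer_counts(seq, k):
    """Count every overlapping word of length k on the given strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    # Hypothetical toy sequence, not real E. coli data.
    toy = "GATCGCTGGTGGCCAGGATCCTGG"
    for name, motif in MARKERS.items():
        print(name, motif_positions(toy, motif))
    print(kmer_counts(toy, 2).most_common(3))
```

On a real contig, the spacing between successive positions returned by `motif_positions` would be the starting point for the kind of clustering and regularity assessments the abstract mentions; scanning the reverse complement as well would be needed for non-palindromic motifs such as Chi.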