Williamson R M, Hetherington J, Jackson J H
Department of Microbiology and Public Health, Michigan State University, East Lansing 48824.
J Mol Evol. 1993 Apr;36(4):347-60. doi: 10.1007/BF00182182.
The Escherichia coli K-12 genetic map was divided into intervals of equal length to count the number of genes per interval. Plots of genes per interval at four sets of interval lengths revealed large-scale clustering of genes, with the major clusters occurring at regular spacing. Major gene cluster properties were analyzed at a scale of 100 intervals, wherein each interval corresponded to a genetic map unit length of 1 min. In any major gene cluster, the highest gene concentration was observed at or near the midpoint interval, and the number of genes per interval was found to decline exponentially as a function of the linear distance from the midpoint or interval of peak gene concentration of that cluster. An autocorrelation analysis of gene content in first-neighbor intervals throughout the chromosome revealed an ordered first-neighbor relationship in comparison to 2,000 randomized interval versions of the chromosome. Attempts to simulate gene placement by a Gaussian model did not produce large-scale gene clustering in any way comparable to that observed on the chromosome. We propose that major gene clusters formed from smaller gene clusters, and the contemporary chromosome formed from fusion of homologous or heterologous major gene clusters.
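The interval counting and first-neighbor comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact autocorrelation estimator or randomization procedure, so the choices here are assumptions: a circular lag-1 autocorrelation of interval counts, and a null distribution built by shuffling the order of the 100 intervals 2,000 times. All function names are hypothetical, and the gene positions are synthetic (drawn so that density falls off exponentially from each cluster midpoint), not the K-12 map data.

```python
import random

N_INTERVALS = 100   # 1-min intervals on a 100-min circular map
MAP_LENGTH = 100.0  # map length in minutes


def genes_per_interval(positions_min, n_intervals=N_INTERVALS, map_length=MAP_LENGTH):
    """Count genes falling in each equal-length interval of a circular map."""
    counts = [0] * n_intervals
    width = map_length / n_intervals
    for pos in positions_min:
        # Wrap positions onto the circle; the final modulo guards against
        # floating-point rounding that could yield index == n_intervals.
        idx = int((pos % map_length) / width) % n_intervals
        counts[idx] += 1
    return counts


def lag1_autocorrelation(counts):
    """Circular lag-1 autocorrelation of interval counts (assumed estimator)."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts)
    if var == 0:
        return 0.0  # constant counts carry no neighbor structure
    cov = sum((counts[i] - mean) * (counts[(i + 1) % n] - mean) for i in range(n))
    return cov / var


def permutation_p_value(counts, n_shuffles=2000, seed=0):
    """Compare the observed lag-1 autocorrelation with shuffled interval orders."""
    rng = random.Random(seed)
    observed = lag1_autocorrelation(counts)
    shuffled = list(counts)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if lag1_autocorrelation(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_shuffles + 1)


def synthetic_positions(centers=(10.0, 35.0, 60.0, 85.0), genes_per_cluster=250,
                        scale_min=3.0, seed=1):
    """Synthetic gene positions: density decays exponentially from each center."""
    rng = random.Random(seed)
    positions = []
    for center in centers:
        for _ in range(genes_per_cluster):
            offset = rng.expovariate(1.0 / scale_min)  # mean offset = scale_min
            sign = 1.0 if rng.random() < 0.5 else -1.0
            positions.append(center + sign * offset)
    return positions


if __name__ == "__main__":
    counts = genes_per_interval(synthetic_positions())
    r1, p = permutation_p_value(counts)
    print(f"lag-1 autocorrelation = {r1:.3f}, permutation p = {p:.4f}")
```

Shuffling whole intervals preserves the distribution of genes per interval while destroying neighbor order, which is one plausible reading of "randomized interval versions of the chromosome"; other null models (for example, re-placing individual genes at random) would test a different hypothesis.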