Huerta Araceli M, Francino M Pilar, Morett Enrique, Collado-Vides Julio
Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, México.
PLoS Genet. 2006 Nov 10;2(11):e185. doi: 10.1371/journal.pgen.0020185. Epub 2006 Sep 12.
The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, which could be recognized by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. To assess the generality of this pattern, we have analyzed 43 additional genomes belonging to most established bacterial phyla. Differential densities between regulatory and nonregulatory regions are detectable in most of the analyzed genomes, with the exception of those that have evolved toward extreme genome reduction. Thus, the presence of this pattern follows that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is an outcome of the process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential distribution of promoter-like signals between regulatory and nonregulatory regions detected in large bacterial genomes confers a significant, although small, fitness advantage.
This study paves the way for further identification of the specific types of selective constraints that affect the organization of regulatory regions and the overall distribution of promoter-like signals through more detailed comparative analyses among closely related bacterial genomes.
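The notion of a "density of promoter-like signals" in a region can be illustrated with a minimal sketch. The abstract does not say how signals were scored, so everything below is an assumption made for illustration: the consensus hexamers TTGACA and TATAAT, the 15 to 19 bp spacer range, the mismatch threshold, and the function names are hypothetical stand-ins, not the method used in the study.

```python
from typing import List

# Illustrative assumptions only; none of these values come from the study.
MINUS_35 = "TTGACA"      # assumed -35 consensus hexamer
MINUS_10 = "TATAAT"      # assumed -10 consensus hexamer
SPACERS = range(15, 20)  # assumed spacer lengths between the hexamers (bp)
MAX_MISMATCHES = 3       # assumed total mismatches tolerated over both hexamers


def mismatches(site: str, consensus: str) -> int:
    """Count positions where the site differs from the consensus."""
    return sum(1 for a, b in zip(site, consensus) if a != b)


def promoter_like_positions(seq: str) -> List[int]:
    """Return start positions of -35 boxes that pair with a -10 box."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 6 + 1):
        box35 = seq[start:start + 6]
        for spacer in SPACERS:
            pos10 = start + 6 + spacer
            box10 = seq[pos10:pos10 + 6]
            if len(box10) < 6:
                break  # the -10 box would run past the end of the sequence
            total = mismatches(box35, MINUS_35) + mismatches(box10, MINUS_10)
            if total <= MAX_MISMATCHES:
                hits.append(start)  # count each -35 start at most once
                break
    return hits


def signal_density(seq: str) -> float:
    """Promoter-like signals per base pair in the given region."""
    if not seq:
        return 0.0
    return len(promoter_like_positions(seq)) / len(seq)
```

Under these assumptions, comparing signal_density for a regulatory region with the value for a coding region, or for a region between convergently transcribed genes, gives the kind of differential density the abstract describes.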