Pérez-Rueda Ernesto, Janga Sarath Chandra, Martínez-Antonio Agustino
Departamento de Ingeniería Celular y Biocatálisis, IBT-UNAM. AP. 565-A, Cuernavaca, Morelos, 62210, México.
Mol Biosyst. 2009 Dec;5(12):1494-501. doi: 10.1039/b907384a. Epub 2009 Jul 17.
The metabolic, defensive, communicative and pathogenic capabilities of eubacteria depend on their repertoire of genes and their ability to regulate the expression of those genes. Sigma factors and transcription factors (TFs) have fundamental roles in controlling these processes. Here, we show that sigma factors, TFs and protein-coding genes occur in different magnitudes across 291 non-redundant eubacterial genomes. We suggest that these differences can be explained by the greater flexibility for transcriptional regulation that the universe of TFs exhibits in contrast to sigma factors: TFs sense diverse stimuli through a variety of ligand-binding domains, discriminate over longer regions of DNA through their diverse DNA-binding domains, and act combinatorially with sigma factors and other TFs. We also note that the diversity of extra-cytoplasmic sigma factors and TF families is constrained in larger genomes. Our results indicate that the most widely distributed families across eubacteria are small in size, while large families are relatively limited in their distribution across genomes. Clustering of the distribution of TF and sigma factor families across genomes suggests that functional constraints could force their co-evolution, as was observed in the sigma54, IHF and EBP families. Our results also indicate that large families might be a consequence of lifestyle, as pathogens and free-living organisms were found to exhibit a major proportion of these expanded families. Our results suggest that understanding proteomes from an integrated perspective, as presented in this study, can be a general framework for uncovering the relationships between different classes of proteins.