MRC Laboratory of Molecular Biology, Cambridge CB2 0QH, UK.
Nucleic Acids Res. 2010 Nov;38(21):7364-77. doi: 10.1093/nar/gkq617. Epub 2010 Jul 30.
Sequence-specific transcription factors (TFs) are important to genetic regulation in all organisms because they recognize and directly bind to regulatory regions on DNA. Here, we survey and summarize the available TF resources. We outline the organisms for which TF annotation is provided and discuss the criteria and methods that different databases use to annotate TFs. Using genomic TF repertoires from ∼700 genomes across the tree of life, covering Bacteria, Archaea and Eukaryota, we review TF abundance with respect to the number of genes, as well as the structural complexity of TFs in diverse lineages. While typical eukaryotic TFs are longer than the average eukaryotic protein, the inverse is true for prokaryotes. Only in eukaryotes does the same family of DNA-binding domain (DBD) occur multiple times within one polypeptide chain. Reusing DBDs from the same family in this way potentially increases the length and diversity of the DNA-recognition sequence. We examined the increase in TF abundance with the number of genes in genomes, using the largest set of prokaryotic and eukaryotic genomes to date. As pointed out before, the number of prokaryotic TFs increases faster than linearly with the number of genes. We further observe a similar relationship in eukaryotic genomes, with a slower increase in TFs.
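The statement that TF abundance grows faster than linearly with gene number can be made concrete as a power law, TFs ≈ a · genes^b, where an exponent b > 1 corresponds to faster-than-linear growth. The following is a minimal sketch of how such an exponent could be estimated by least squares in log-log space. The function name `fit_power_law` is hypothetical, and the input values are synthetic numbers generated from an arbitrarily chosen exponent of 1.5; they are not data or results from the survey, and the abstract does not state which fitting procedure the authors used.

```python
import math


def fit_power_law(gene_counts, tf_counts):
    """Fit tf = a * genes**b by ordinary least squares on log-transformed values.

    Returns the tuple (a, b). Hypothetical helper for illustration only.
    """
    if len(gene_counts) != len(tf_counts) or len(gene_counts) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")

    xs = [math.log(g) for g in gene_counts]
    ys = [math.log(t) for t in tf_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("gene counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log-log space = log(a)
    return a, b


# Synthetic illustration only: values generated from an assumed exponent of 1.5.
genes = [500, 1000, 2000, 4000, 8000]
tfs = [0.01 * g ** 1.5 for g in genes]

a, b = fit_power_law(genes, tfs)
print(f"exponent b = {b:.2f}")  # prints "exponent b = 1.50" for this synthetic input
```

With real genome data, the fit would be run separately on the prokaryotic and eukaryotic sets and the two exponents compared; the scatter of real data around the fitted line would also need to be assessed before interpreting the exponent.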