Minezaki Yoshiaki, Homma Keiichi, Nishikawa Ken
Laboratory of Gene-Product Informatics, Center for Information Biology-DNA Data Bank of Japan, National Institute of Genetics, Research Organization of Information and Systems, Yata, Mishima, Shizuoka, Japan.
DNA Res. 2005;12(5):269-80. doi: 10.1093/dnares/dsi016. Epub 2006 Jan 10.
Assigning all transcription factors (TFs) from genome sequence data is not straightforward because TFs vary widely among species. Most members of TF families contain a DNA-binding domain (DBD) and a contiguous non-DBD, with a characteristic combination of SCOP or Pfam domains. We found that most experimentally verified TFs in prokaryotes are detectable by a combination of SCOP or Pfam domains assigned to DBDs and non-DBDs. Based on this finding, we set up rules to detect TFs and classify them into 52 TF families. Applying the rules to 154 completely sequenced prokaryotic genomes detected >18,000 TFs classified into families, which have been made publicly available in the 'GTOP_TF' database. Although the number of TFs per genome is roughly proportional to genome size, species with reduced genomes, i.e. obligate parasites and symbionts, have few if any TFs, reflecting a nearly complete loss. The number of TFs is also significantly lower in archaea than in bacteria. In addition, all but 1 of the 19 TF families present in archaea are also present in bacteria, whereas 33 TF families are found exclusively in bacteria. This observation indicates that a number of new TF families have evolved in bacteria, making the transcription regulatory system more divergent in bacteria than in archaea.
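The rule-based detection described in the abstract can be illustrated with a minimal sketch. It assumes each protein is already annotated as an ordered list of domain names and that a rule is a single DBD/non-DBD pair that must sit next to each other. The family labels, the domain names (`DBD_A`, `NONDBD_A`, and so on), and the functions `classify_protein` and `count_tfs_per_family` are hypothetical placeholders. They are not the rules or identifiers used for GTOP_TF, which the abstract does not list.

```python
from typing import Dict, List, NamedTuple, Optional


class FamilyRule(NamedTuple):
    """One hypothetical detection rule: a DBD paired with a non-DBD."""
    dbd: str
    non_dbd: str


# Placeholder rules for illustration only; the real 52 family rules
# are not given in the abstract.
RULES: Dict[str, FamilyRule] = {
    "family_A": FamilyRule(dbd="DBD_A", non_dbd="NONDBD_A"),
    "family_B": FamilyRule(dbd="DBD_B", non_dbd="NONDBD_B"),
}


def classify_protein(
    domains: List[str], rules: Dict[str, FamilyRule] = RULES
) -> Optional[str]:
    """Return the first family whose DBD and non-DBD are adjacent, else None.

    `domains` is the protein's domain architecture in N- to C-terminal
    order. Either orientation (DBD first or non-DBD first) is accepted.
    """
    for family, rule in rules.items():
        # Walk over neighbouring domain pairs to enforce contiguity.
        for left, right in zip(domains, domains[1:]):
            if {left, right} == {rule.dbd, rule.non_dbd}:
                return family
    return None


def count_tfs_per_family(genome: Dict[str, List[str]]) -> Dict[str, int]:
    """Count detected TFs per family for one genome (protein id -> domains)."""
    counts: Dict[str, int] = {}
    for domains in genome.values():
        family = classify_protein(domains)
        if family is not None:
            counts[family] = counts.get(family, 0) + 1
    return counts
```

In this sketch a protein carrying a DBD alone, or a DBD separated from its partner domain by an unrelated domain, is not counted as a TF. That is one simple reading of "contiguous"; an actual pipeline may be more permissive.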