Suvorova Inna A, Korostelev Yuri D, Gelfand Mikhail S
Research and Training Center on Bioinformatics, Institute for Information Transmission Problems RAS (The Kharkevich Institute), Moscow, Russia.
Research and Training Center on Bioinformatics, Institute for Information Transmission Problems RAS (The Kharkevich Institute), Moscow, Russia; Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia.
PLoS One. 2015 Jul 7;10(7):e0132618. doi: 10.1371/journal.pone.0132618. eCollection 2015.
The GNTR family of transcription factors (TFs) is a large group of proteins that are present in diverse bacteria and regulate various biological processes. Here we use a comparative genomics approach to reconstruct regulons and identify binding motifs of regulators from three subfamilies of the GNTR family: FADR, HUTC, and YTRA. Using these data, we attempt to predict DNA-protein contacts by analyzing correlations between binding motifs in DNA and amino acid sequences of TFs. We identify pairs of positions with high correlation between amino acids and nucleotides for the FADR, HUTC, and YTRA subfamilies and show that most of the predicted DNA-protein interactions are similar across all three subfamilies and agree well with the experimentally identified contacts formed by FadR from E. coli and AraR from B. subtilis. The most frequent predicted contacts in the analyzed subfamilies are Arg-G, Asn-A, and Asp-C. We also analyze divergon structure and preferred site positions relative to regulated genes in the FADR and HUTC subfamilies. A single site in a divergon usually regulates both operons and lies approximately in the middle of the intergenic region. Double sites either take part in cooperative regulation of both operons, in which case they lie in the center of the intergenic region, or each site in the pair independently regulates its own operon and tends to lie near it. We also identify additional candidate TF-binding boxes near palindromic binding sites of TFs from the FADR, HUTC, and YTRA subfamilies, which may play a role in the binding of additional TF subunits.
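The correlation analysis between protein and DNA positions can be illustrated with a minimal sketch. The abstract does not state which correlation measure is used, so mutual information is an assumption here, and the function name `mutual_information` and the toy data are hypothetical. Each regulator contributes one pair: the amino acid at a chosen position of its DNA-binding domain and the nucleotide at a chosen position of its binding motif.

```python
from collections import Counter
from math import log2


def mutual_information(residues, bases):
    """Mutual information (in bits) between paired observations.

    residues[i] is the amino acid at one protein alignment column for
    regulator i; bases[i] is the nucleotide at one motif position for
    the same regulator. Assumed measure, not taken from the abstract.
    """
    if len(residues) != len(bases):
        raise ValueError("residues and bases must be paired one-to-one")
    n = len(residues)
    joint = Counter(zip(residues, bases))  # counts of (amino acid, base) pairs
    aa_counts = Counter(residues)          # marginal amino acid counts
    nt_counts = Counter(bases)             # marginal nucleotide counts
    mi = 0.0
    for (aa, nt), count in joint.items():
        p_joint = count / n
        # p_joint / (p_aa * p_nt), written with raw counts
        ratio = count * n / (aa_counts[aa] * nt_counts[nt])
        mi += p_joint * log2(ratio)
    return mi


# Toy data: Arg always pairs with G and Asn always pairs with A.
print(mutual_information("RRRRNNNN", "GGGGAAAA"))  # 1.0
# Toy data: residue and base vary independently.
print(mutual_information("RRNN", "GAGA"))          # 0.0
```

A high score for a pair of positions flags a candidate contact only; real alignments would also need a correction for small sample size and phylogenetic bias, which this sketch omits.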