Levati Elisabetta, Sartini Sara, Ottonello Simone, Montanini Barbara
Biochemistry and Molecular Biology Unit, Laboratory of Functional Genomics and Protein Engineering, Department of Life Sciences, University of Parma, Parco Area delle Scienze 23/A, 43124 Parma, Italy.
Comput Struct Biotechnol J. 2016 Jun 29;14:262-70. doi: 10.1016/j.csbj.2016.06.004. eCollection 2016.
Transcription factors (TFs) are master gene products that regulate gene expression in response to a variety of stimuli. They interact with DNA in a sequence-specific manner using a variety of DNA-binding domain (DBD) modules. This allows them to properly position their second domain, called the "effector domain", to directly or indirectly recruit positively or negatively acting co-regulators, including chromatin modifiers, thus modulating preinitiation complex formation as well as transcription elongation. Unlike the DBDs, which are composed of well-defined and easily recognizable DNA-binding motifs, effector domains are usually much less conserved and thus considerably more difficult to predict. The DNA-binding sites of TFs are also difficult to identify, especially on a genome-wide basis and in the case of overlapping binding regions. Another emerging issue, with many potential regulatory implications, is that of so-called "moonlighting" transcription factors, i.e., proteins with an annotated function unrelated to transcription and lacking any recognizable DBD or effector domain, that play a role in gene regulation as their second job. Starting from bioinformatic and experimental high-throughput tools for an unbiased, genome-wide identification and functional characterization of TFs (especially transcriptional activators), we describe both established (and usually affordable) and newly developed platforms for DNA-binding site identification. Selected combinations of these search tools, some of which rely on next-generation sequencing approaches, allow delineating the entire repertoire of TFs and unconventional regulators encoded by any sequenced genome.