Nagy Gergely, Nagy Laszlo
Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, HU 4032, Hungary.
Johns Hopkins University School of Medicine, Departments of Medicine and Biological Chemistry, Institute for Fundamental Biomedical Research, Johns Hopkins All Children's Hospital, Saint Petersburg, FL 33701, USA.
Comput Struct Biotechnol J. 2020 Jul 18;18:2026-2032. doi: 10.1016/j.csbj.2020.07.007. eCollection 2020.
The collaboration between transcription factors (TFs) and their recognition motifs in DNA is the result of coevolution and forms the basis of gene regulation. However, how these short genomic sequences contribute to setting the level of gene products is not understood in sufficient detail. The biological problem the cell must solve is complex, because each gene requires a unique regulatory network in each cellular condition using the same genome. Thus far, only some components of these networks have been uncovered. In this review, we compile the features and principles of the motif grammar, which dictates the characteristics, and thus the likelihood, of the interactions between the binding TFs and their coregulators. We present how sequence features provide specificity using two major TF superfamilies as examples: the bZIP proteins and the nuclear receptors. We also discuss the phenomenon of "weak" (low-affinity) binding sites, which appear to be components of several important genomic regulatory regions but, paradoxically, are barely detectable by currently used approaches. Assembling the complete set of regulatory regions, composed of both weak and strong binding sites, will yield more comprehensive lists of the factors that play roles in gene regulation, making a deeper understanding of regulatory networks possible.