The South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China.
State Key Laboratory of Agrobiotechnology, School of Life Sciences, The Chinese University of Hong Kong, Hong Kong, China.
Nat Commun. 2022 Dec 9;13(1):7632. doi: 10.1038/s41467-022-35438-4.
Non-coding cis-regulatory variants in animal genomes are an important driving force in the evolution of transcription regulation and phenotype diversity. However, cistrome dynamics in plants remain largely underexplored. Here, we compare the binding of GOLDEN2-LIKE (GLK) transcription factors in tomato, tobacco, Arabidopsis, maize and rice. Although the function of GLKs is conserved, most of their binding sites are species-specific. Conserved binding sites are often found near photosynthetic genes that depend on GLK for expression, but sites near genes that are not differentially expressed in the glk mutant are nevertheless under purifying selection. The regulatory potential of the binding sites can be predicted by a machine learning model using quantitative genome features and TF co-binding information. Our study shows that genome cis-variation caused widespread TF binding divergence, and that most TF binding sites are genetically redundant. This poses a major challenge for interpreting the effect of individual sites and highlights the importance of quantitatively measuring TF occupancy.