School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
Mol Biol Evol. 2021 Aug 23;38(9):3898-3909. doi: 10.1093/molbev/msab085.
Enhancers are often studied as noncoding regulatory elements that modulate the precise spatiotemporal expression of genes in a highly tissue-specific manner. This paradigm has been challenged by recent evidence of individual enhancers acting in multiple tissues or developmental contexts. However, the frequency of these enhancers with high degrees of "pleiotropy" out of all putative enhancers is not well understood. Consequently, it is unclear how the variation of enhancer pleiotropy corresponds to the variation in expression breadth of target genes. Here, we use multi-tissue chromatin maps from diverse human tissues to investigate the enhancer-gene interaction architecture while accounting for 1) the distribution of enhancer pleiotropy, 2) the variations of regulatory links from enhancers to target genes, and 3) the expression breadth of target genes. We show that most enhancers are tissue-specific and that highly pleiotropic enhancers account for <1% of all putative regulatory sequences in the human genome. Notably, several genomic features are indicative of increasing enhancer pleiotropy, including longer sequence length, greater number of links to genes, increasing abundance and diversity of encoded transcription factor motifs, and stronger evolutionary conservation. Intriguingly, the number of enhancers per gene remains remarkably consistent for all genes (∼14). However, enhancer pleiotropy does not directly translate to the expression breadth of target genes. We further present a series of Gaussian Mixture Models to represent this organization architecture. Consequently, we demonstrate that a modest trend of more pleiotropic enhancers targeting more broadly expressed genes can generate the observed diversity of expression breadths in the human genome.
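As an illustration of the quantity discussed above, the following minimal sketch computes a pleiotropy score per enhancer and the share of enhancers at each level. It assumes that pleiotropy is counted as the number of tissues in which an enhancer is called active; the abstract does not state the exact definition. The enhancer IDs, tissue names, toy data, and the threshold of 4 tissues are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical toy input: enhancer ID -> set of tissues where it is called active.
activity = {
    "enh1": {"liver"},
    "enh2": {"liver", "heart"},
    "enh3": {"brain"},
    "enh4": {"liver", "heart", "brain", "lung"},
    "enh5": {"lung"},
}


def pleiotropy(activity):
    """Count the tissues in which each enhancer is active (assumed definition)."""
    return {enh: len(tissues) for enh, tissues in activity.items()}


def fraction_at_least(scores, threshold):
    """Return the share of enhancers whose pleiotropy is >= threshold."""
    if not scores:
        return 0.0
    return sum(1 for s in scores.values() if s >= threshold) / len(scores)


scores = pleiotropy(activity)

# Distribution of pleiotropy: how many enhancers are active in 1, 2, ... tissues.
distribution = Counter(scores.values())

# Share of tissue-specific enhancers (active in exactly one tissue).
tissue_specific = distribution[1] / len(scores)

# Share of "highly pleiotropic" enhancers under the hypothetical threshold of 4.
highly_pleiotropic = fraction_at_least(scores, 4)
```

The same counting applied to target genes (the number of tissues in which a gene is expressed) would give an expression breadth, which is the quantity the abstract compares against enhancer pleiotropy.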