National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Genome Res. 2023 Oct;33(10):1662-1672. doi: 10.1101/gr.278130.123. Epub 2023 Oct 26.
Housekeeping genes are considered to be regulated by common enhancers across different tissues. Here we report that most of the commonly expressed mouse or human genes across different cell types, including more than half of the previously identified housekeeping genes, are associated with cell type-specific enhancers. Furthermore, the binding of most transcription factors (TFs) is cell type-specific. We reason that these cell type specificities are causally related to the collective TF recruitment at regulatory sites, as TFs tend to bind to regions associated with many other TFs and each cell type has a unique repertoire of expressed TFs. Based on binding profiles of hundreds of TFs from HepG2, K562, and GM12878 cells, we show that 80% of all TF peaks overlapping H3K27ac signals are in the top 20,000-23,000 most TF-enriched H3K27ac peak regions, and approximately 12,000-15,000 of these peaks are enhancers (nonpromoters). Those enhancers are mainly cell type-specific and include those linked to the majority of commonly expressed genes. Moreover, we show that the top 15,000 most TF-enriched regulatory sites in HepG2 cells, associated with about 200 TFs, can be predicted largely from the binding profile of as few as 30 TFs. Through motif analysis, we show that major enhancers harbor diverse and clustered motifs from a combination of available TFs uniquely present in each cell type. We propose a mechanism that explains how the highly focused TF binding at regulatory sites results in cell type specificity of enhancers for housekeeping and commonly expressed genes.
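The concentration statistic above (80% of TF peaks falling within the top 20,000-23,000 most TF-enriched H3K27ac regions) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a per-region count of overlapping TF peaks has already been computed, for example by intersecting TF ChIP-seq peaks with H3K27ac peaks; the function name `regions_needed_for_coverage` and the toy counts are hypothetical.

```python
from typing import Sequence


def regions_needed_for_coverage(peaks_per_region: Sequence[int],
                                fraction: float = 0.8) -> int:
    """Return how many of the most TF-enriched regions are needed to
    contain at least `fraction` of all TF peaks.

    `peaks_per_region` holds, for each H3K27ac region, the number of TF
    peaks overlapping it (assumed to be precomputed elsewhere).
    """
    total = sum(peaks_per_region)
    if total == 0:
        return 0
    # Rank regions from most to least TF-enriched.
    ranked = sorted(peaks_per_region, reverse=True)
    covered = 0
    for n_regions, count in enumerate(ranked, start=1):
        covered += count
        if covered >= fraction * total:
            return n_regions
    return len(ranked)


# Toy data (hypothetical): four regions with 60, 25, 10, and 5 TF peaks.
toy_counts = [10, 60, 5, 25]
top_n = regions_needed_for_coverage(toy_counts, fraction=0.8)
```

On real data, the returned value would be compared with the total number of H3K27ac regions to judge how strongly TF binding is concentrated in a subset of sites.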