HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.
Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA.
Nature. 2020 Jul;583(7818):720-728. doi: 10.1038/s41586-020-2023-4. Epub 2020 Jul 29.
Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP-seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium.