Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
Genome Res. 2018 Apr;28(4):497-508. doi: 10.1101/gr.229518.117. Epub 2018 Mar 21.
General regulatory factors (GRFs), such as Reb1, Abf1, Rap1, Mcm1, and Cbf1, positionally organize yeast chromatin through interactions with a core consensus DNA sequence. It is assumed that sequence recognition via direct base readout suffices for specificity and that spurious nonfunctional sites are rendered inaccessible by chromatin. We tested these assumptions through genome-wide mapping of GRFs in vivo and in purified biochemical systems at near-base pair (bp) resolution using several ChIP-exo-based assays. We find that computationally predicted DNA shape features (e.g., minor groove width, helix twist, base roll, and propeller twist) that are not defined by a unique consensus sequence are embedded in the nonunique portions of GRF motifs and contribute critically to sequence-specific binding. This dual-source specificity occurs at GRF sites in promoter regions where chromatin organization starts. Outside of promoter regions, strong consensus sites lack the shape component and consequently lack an intrinsic ability to bind cognate GRFs, without regard to influences from chromatin. However, sites having a weak consensus and low intrinsic affinity do exist in these regions but are rendered inaccessible in a chromatin environment. Thus, GRF site-specificity is achieved through integration of favorable DNA sequence and shape readouts in promoter regions and by chromatin-based exclusion from fortuitous weak sites within gene bodies. This study further revealed a severe G/C nucleotide cross-linking selectivity inherent in all formaldehyde-based ChIP assays, including ChIP-seq. However, for most tested proteins, G/C selectivity did not appreciably affect binding site detection, although it does limit how quantitatively occupancy levels can be measured.