Rozenberg Julian M, Shlyakhtenko Andrey, Glass Kimberly, Rishi Vikas, Myakishev Maxim V, FitzGerald Peter C, Vinson Charles
Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892, USA.
BMC Genomics. 2008 Feb 5;9:67. doi: 10.1186/1471-2164-9-67.
The promoters of housekeeping genes are well-bound by RNA polymerase II (RNAP) in different tissues. Although the promoters of these genes are known to contain CpG islands, the specific DNA sequences that are associated with high RNAP binding to housekeeping promoters have not been described.
ChIP-chip experiments from three mouse tissues (liver, heart ventricles, and primary keratinocytes) indicate that 94% of promoters have similar RNAP binding, ranging from well-bound to poorly-bound in all tissues. Using all 8-base-pair sequences as a test set, we have identified the DNA sequences that are enriched in promoters of housekeeping genes, focusing on those DNA sequences that are preferentially localized in the proximal promoter. We observe a bimodal distribution. Virtually all sequences enriched in promoters with high RNAP binding values contain a CpG dinucleotide. These results suggest that only transcription factor binding sites (TFBS) that contain the CpG dinucleotide are involved in RNAP binding to housekeeping promoters, while TFBS that do not contain a CpG are involved in regulated promoter activity. Abundant 8-mers that are preferentially localized in the proximal promoters and exhibit the best enrichment in RNAP-bound promoters are all variants of six known CpG-containing TFBS: ETS, NRF-1, BoxA, SP1, CRE, and E-Box. The frequency of these six DNA motifs can predict housekeeping promoters as accurately as the presence of a CpG island, suggesting that they are the structural elements critical for CpG island function. Experimental EMSA results demonstrate that methylation of the CpG in the ETS, NRF-1, and SP1 motifs prevents DNA binding in nuclear extracts from both keratinocytes and liver.
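The 8-mer enrichment idea described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the pseudocount, the presence/absence scoring, and the toy sequences are all assumptions made for illustration. It scores each 8-mer by the ratio of the fraction of RNAP-bound promoters containing it to the fraction of poorly-bound promoters containing it, and flags whether the 8-mer contains a CpG.

```python
from collections import Counter


def kmers_present(seq, k=8):
    """Return the set of distinct k-mers occurring in one sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_enrichment(bound, unbound, k=8, pseudocount=1.0):
    """Ratio of promoter fractions containing each k-mer (bound / unbound).

    Each promoter counts at most once per k-mer. The pseudocount is an
    illustrative assumption that avoids division by zero.
    """
    bound_counts = Counter(km for s in bound for km in kmers_present(s, k))
    unbound_counts = Counter(km for s in unbound for km in kmers_present(s, k))
    scores = {}
    for km in set(bound_counts) | set(unbound_counts):
        frac_bound = (bound_counts[km] + pseudocount) / (len(bound) + pseudocount)
        frac_unbound = (unbound_counts[km] + pseudocount) / (len(unbound) + pseudocount)
        scores[km] = frac_bound / frac_unbound
    return scores


def has_cpg(kmer):
    """True if the k-mer contains a CpG dinucleotide (a C followed by a G)."""
    return "CG" in kmer.upper()


# Hypothetical toy promoter sequences, not data from the study.
bound_promoters = ["GCCGGAAGTGAC", "TTGCCGGAAGTG"]
unbound_promoters = ["ATATATATATAT", "TTTTAAAATTTT"]

scores = kmer_enrichment(bound_promoters, unbound_promoters)
for kmer, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(score, 2), "CpG" if has_cpg(kmer) else "no CpG")
```

A real analysis would also need to account for positional preference within the proximal promoter and for reverse complements, both of which this sketch ignores.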
In general, TFBS that do not contain a CpG are involved in regulated gene expression, while TFBS that contain a CpG are involved in constitutive gene expression, with some CpG-containing sequences also involved in inducible and tissue-specific gene regulation. These TFBS are not bound when the CpG is methylated. Unmethylated CpG dinucleotides in the TFBS in CpG islands allow the transcription factors to find their binding sites, which occur only in promoters, in turn localizing RNAP to promoters.