Department of Molecular and Cellular Biology, Division of Biology, Beckman Research Institute, City of Hope, Duarte, CA 91010, USA.
Proc Natl Acad Sci U S A. 2010 Aug 31;107(35):15485-90. doi: 10.1073/pnas.1010506107. Epub 2010 Aug 17.
CpG dinucleotides contribute to epigenetic mechanisms by being the only site for DNA methylation in mammalian somatic cells. They are also mutation hotspots and approximately 5-fold depleted genome-wide. We report here a study focused on CpG sites in the coding regions of Hox and other transcription factor genes, comparing methylated genomes of Homo sapiens, Mus musculus, and Danio rerio with nonmethylated genomes of Drosophila melanogaster and Caenorhabditis elegans. We analyzed 4-fold degenerate, synonymous codons with the potential for CpG. That is, we studied "silent" changes that do not affect protein products but could damage epigenetic marking. We find that DNA-binding transcription factors and other developmentally relevant genes show, only in methylated genomes, a bimodal distribution of CpG usage. Several genetic code-based tests indicate, again for methylated genomes only, that the frequency of silent CpGs in Hox genes is much greater than expectation. Also informative are NCG-GNN and NCC-GNN codon doublets, for which an unusually high rate of G to C and C to G transversions was observed at the third (silent) position of the first codon. Together these results are interpreted as evidence for strong "pro-epigenetic" selection acting to preserve CpG sites in coding regions of many genes controlling development. We also report that DNA-binding transcription factors and developmentally important genes are dramatically overrepresented in or near clusters of three or more CpG islands, suggesting a possible relationship between evolutionary preservation of CpG dinucleotides in both coding regions and CpG islands.
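The counting behind the silent-site analysis can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes an in-frame coding sequence in DNA letters and looks only at the NCN codon families (TCN, CCN, ACN, GCN), which are 4-fold degenerate in the standard genetic code. The function names `codons` and `silent_cpg_counts` are hypothetical, introduced here for illustration. For each NCN codon the sketch records the third-position base, and it separately records that base when the next codon begins with G, which covers the NCG-GNN and NCC-GNN doublets mentioned in the abstract.

```python
from collections import Counter

BASES = set("ACGT")


def codons(cds):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def silent_cpg_counts(cds):
    """Tally third-position bases of NCN codons (illustrative assumption: NCN families only).

    Returns two Counters keyed by the third base:
      within  - all NCN codons; key 'G' means NCG, a CpG at codon positions 2-3
      doublet - NCN codons followed by a GNN codon; key 'G' is an NCG-GNN doublet,
                key 'C' is an NCC-GNN doublet (CpG across the codon boundary)
    """
    cs = codons(cds)
    within = Counter()
    doublet = Counter()
    for i, codon in enumerate(cs):
        # Skip codons with ambiguous letters and codons outside the NCN families.
        if not set(codon) <= BASES or codon[1] != "C":
            continue
        within[codon[2]] += 1
        if i + 1 < len(cs) and cs[i + 1][0] == "G":
            doublet[codon[2]] += 1
    return within, doublet


if __name__ == "__main__":
    # Toy sequence, not real data: GCG GAT ACC GGA TCA TAA
    within, doublet = silent_cpg_counts("GCGGATACCGGATCATAA")
    print(dict(within), dict(doublet))
```

Comparing `within["G"]` against the total number of NCN codons gives an observed silent-CpG frequency. What counts as the "expectation" is a modelling choice that this sketch deliberately leaves open.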