Feltus F Alex, Lee Eva K, Costello Joseph F, Plass Christoph, Vertino Paula M
Department of Radiation Oncology and Winship Cancer Institute, Emory University School of Medicine, 1365-C Clifton Road NE, Atlanta, GA 30322, USA.
Genomics. 2006 May;87(5):572-9. doi: 10.1016/j.ygeno.2005.12.016. Epub 2006 Feb 17.
Epigenetic silencing involving the aberrant methylation of promoter region CpG islands is widely recognized as a mechanism of tumor suppressor gene silencing in cancer. However, the molecular pathways underlying aberrant DNA methylation remain elusive. We recently showed that, on a genome-wide level, CpG island loci differ in their intrinsic susceptibility to aberrant methylation and that this susceptibility can be predicted from the underlying sequence context. These data suggest that sequence or structural features contribute to protection from, or susceptibility to, aberrant methylation. Here we use motif elicitation coupled with classification techniques to identify DNA sequence motifs that selectively define methylation-prone or methylation-resistant CpG islands. Motifs common to 28 methylation-prone or 47 methylation-resistant CpG island-containing genomic fragments were determined using the MEME and MAST algorithms. The five most discriminatory motifs derived from methylation-prone sequences were found to be associated with CpG islands in general and were nonrandomly distributed throughout the genome. In contrast, the eight most discriminatory motifs derived from the methylation-resistant CpG islands were randomly distributed throughout the genome. Interestingly, this latter group tended to associate with Alu and other repetitive sequences. Used together, the frequencies of occurrence of these motifs discriminated the methylation-prone and methylation-resistant CpG island groups with an accuracy of 87% after 10-fold cross-validation. The motifs identified here are candidate methylation-targeting or methylation-protection DNA sequences.
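The classification step described above (motif occurrence frequencies used as features, evaluated by 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the classifier, so a nearest-centroid rule is swapped in purely for illustration, and regular-expression matching stands in for MAST's position-specific scoring. The function names (`motif_frequencies`, `k_fold_indices`, `cross_validate`) and any motif patterns passed to them are hypothetical.

```python
import random
import re


def motif_frequencies(seq, motifs):
    """Per-base frequency of each motif in seq.

    Assumption: motifs are plain regex patterns; a zero-width lookahead
    is used so that overlapping occurrences are counted.
    """
    return [len(re.findall(f"(?={m})", seq)) / len(seq) for m in motifs]


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict(x, centroids):
    """Return the label whose centroid is nearest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def cross_validate(features, labels, k=10, seed=0):
    """k-fold cross-validated accuracy of the nearest-centroid rule."""
    n = len(features)
    correct = 0
    for fold in k_fold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        # Fit one centroid per class on the training folds only.
        centroids = {
            lab: centroid([features[i] for i in train if labels[i] == lab])
            for lab in {labels[i] for i in train}
        }
        correct += sum(predict(features[i], centroids) == labels[i] for i in fold)
    return correct / n
```

In use, each genomic fragment would be converted to a feature vector with `motif_frequencies`, labeled as methylation-prone or methylation-resistant, and the resulting lists passed to `cross_validate`. Holding out each fold before fitting the centroids keeps the accuracy estimate from being inflated by the sequences being scored.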