Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, USA.
Proc Natl Acad Sci U S A. 2012 May 8;109(19):7332-7. doi: 10.1073/pnas.1201310109. Epub 2012 Apr 20.
DNA methylation mediates imprinted gene expression by passing an epigenomic state across generations and differentially marking specific regulatory regions on maternal and paternal alleles. Imprinting has been tied to the evolution of the placenta in mammals and defects of imprinting have been associated with human diseases. Although recent advances in genome sequencing have revolutionized the study of DNA methylation, existing methylome data remain largely untapped in the study of imprinting. We present a statistical model to describe allele-specific methylation (ASM) in data from high-throughput short-read bisulfite sequencing. Simulation results indicate technical specifications of existing methylome data, such as read length and coverage, are sufficient for full-genome ASM profiling based on our model. We used our model to analyze methylomes for a diverse set of human cell types, including cultured and uncultured differentiated cells, embryonic stem cells and induced pluripotent stem cells. Regions of ASM identified most consistently across methylomes are tightly connected with known imprinted genes and precisely delineate the boundaries of several known imprinting control regions. Predicted regions of ASM common to multiple cell types frequently mark noncoding RNA promoters and represent promising starting points for targeted validation. More generally, our model provides the analytical complement to cutting-edge experimental technologies for surveying ASM in specific cell types and across species.
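The abstract does not spell out the statistical model, so the following is only a minimal sketch of one way such a test could look, not the authors' implementation. It assumes that each read reports a methylated (1), unmethylated (0), or uncovered (None) state at every CpG in a window, that CpGs are independent given the allele, and that both alleles contribute reads in equal proportion. A two-allele mixture fitted by expectation-maximization is compared with a single-profile model using the Bayesian information criterion. All function names, the `EPS` constant, and the choice of BIC are hypothetical.

```python
import math

EPS = 1e-6  # assumed floor/ceiling that keeps every logarithm finite


def clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def read_loglik(read, meth):
    """Log-probability of one read given per-CpG methylation levels."""
    total = 0.0
    for state, p in zip(read, meth):
        if state is not None:  # None marks a CpG the read does not cover
            total += math.log(p if state == 1 else 1.0 - p)
    return total


def weighted_profile(reads, weights, n_cpgs):
    """Weighted per-CpG methylation level; 0.5 where nothing is observed."""
    meth = []
    for j in range(n_cpgs):
        num = sum(w * r[j] for r, w in zip(reads, weights) if r[j] is not None)
        den = sum(w for r, w in zip(reads, weights) if r[j] is not None)
        meth.append(clip(num / den) if den > 0 else 0.5)
    return meth


def single_allele_loglik(reads, n_cpgs):
    """Maximized log-likelihood when all reads share one profile."""
    meth = weighted_profile(reads, [1.0] * len(reads), n_cpgs)
    return sum(read_loglik(r, meth) for r in reads)


def two_allele_loglik(reads, n_cpgs, n_iter=50):
    """EM fit of an equal-weight mixture of two allele profiles."""
    meth_a, meth_b = [0.75] * n_cpgs, [0.25] * n_cpgs  # asymmetric start
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from allele A,
        # computed with a numerically stable logistic function.
        resp = []
        for r in reads:
            d = read_loglik(r, meth_a) - read_loglik(r, meth_b)
            resp.append(1 / (1 + math.exp(-d)) if d >= 0
                        else math.exp(d) / (1 + math.exp(d)))
        # M-step: re-estimate each allele's profile from its weighted reads.
        meth_a = weighted_profile(reads, resp, n_cpgs)
        meth_b = weighted_profile(reads, [1 - w for w in resp], n_cpgs)
    total = 0.0
    for r in reads:
        la, lb = read_loglik(r, meth_a), read_loglik(r, meth_b)
        m = max(la, lb)  # log-sum-exp over the two alleles
        total += math.log(0.5) + m + math.log(math.exp(la - m) + math.exp(lb - m))
    return total


def shows_asm(reads, n_cpgs):
    """True when the two-allele model has the lower (better) BIC."""
    n = len(reads)
    bic_one = n_cpgs * math.log(n) - 2 * single_allele_loglik(reads, n_cpgs)
    bic_two = 2 * n_cpgs * math.log(n) - 2 * two_allele_loglik(reads, n_cpgs)
    return bic_two < bic_one


# Toy window: half the reads fully methylated, half fully unmethylated.
toy_reads = [[1] * 5] * 10 + [[0] * 5] * 10
print(shows_asm(toy_reads, 5))
```

In this sketch a window whose reads split cleanly into methylated and unmethylated groups favors the two-allele model, while a window with intermediate methylation but no read-level structure does not. That distinction is the reason to model whole reads rather than per-CpG averages.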