Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region, China.
Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region, China.
Proc Natl Acad Sci U S A. 2021 Feb 2;118(5). doi: 10.1073/pnas.2019768118.
5-Methylcytosine (5mC) is an important type of epigenetic modification. Bisulfite sequencing (BS-seq) has limitations, such as severe DNA degradation. Using single-molecule real-time sequencing, we developed a methodology to directly examine 5mC. This approach, termed the holistic kinetic (HK) model, holistically examined the kinetic signals of a DNA polymerase (including interpulse duration and pulse width) and the sequence context for every nucleotide within a measurement window. The measurement window of each analyzed double-stranded DNA molecule comprised 21 nucleotides with a cytosine in a CpG site in the center. We used amplified DNA (unmethylated) and M.SssI-treated DNA (methylated) (M.SssI being a CpG methyltransferase) to train a convolutional neural network. The area under the curve for differentiating methylation states using such samples was up to 0.97. The sensitivity and specificity for genome-wide 5mC detection at single-base resolution reached 90% and 94%, respectively. The HK model was then tested on human-mouse hybrid fragments in which each member of the hybrid had a different methylation status. The model was also tested on human genomic DNA molecules extracted from various biological samples, such as buffy coat, placental, and tumoral tissues. The overall methylation levels deduced by the HK model correlated well with those obtained by BS-seq (r = 0.99; P < 0.0001), and the model allowed the measurement of allele-specific methylation patterns in imprinted genes. Taken together, this methodology provides a system for simultaneous genome-wide genetic and epigenetic analyses.
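As an illustration of the measurement window described above, the following minimal sketch assembles a per-nucleotide feature matrix for a 21-nucleotide window centered on a CpG cytosine. Only the window size, the central CpG cytosine, and the use of interpulse duration (IPD), pulse width (PW), and sequence context come from the abstract. The six-column layout, the single-strand view, and the names `cpg_positions` and `window_features` are assumptions made for illustration, not the authors' actual input encoding.

```python
import numpy as np

WINDOW = 21          # window size stated in the abstract
FLANK = WINDOW // 2  # 10 nucleotides on each side of the central cytosine
BASES = "ACGT"       # assumption: reads contain only these four bases


def cpg_positions(seq):
    """Return indices of cytosines immediately followed by a guanine."""
    return [i for i in range(len(seq) - 1)
            if seq[i] == "C" and seq[i + 1] == "G"]


def window_features(seq, ipd, pw, center):
    """Build a (WINDOW, 6) matrix: one-hot base (4 columns), IPD, PW.

    Hypothetical layout for one strand only. Returns None when the
    window would run past either end of the read.
    """
    start, end = center - FLANK, center + FLANK + 1
    if start < 0 or end > len(seq):
        return None
    feats = np.zeros((WINDOW, 6), dtype=np.float32)
    for row, pos in enumerate(range(start, end)):
        feats[row, BASES.index(seq[pos])] = 1.0  # sequence context
        feats[row, 4] = ipd[pos]                 # interpulse duration
        feats[row, 5] = pw[pos]                  # pulse width
    return feats
```

A matrix of this shape could be stacked across CpG sites and passed to a classifier such as a convolutional neural network; how the two strands are combined and how the kinetic values are normalized are left open here because the abstract does not specify them.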