State Key Laboratory for Agrobiotechnology, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China.
State Key Laboratory of Experimental Hematology, Institute of Hematology and Blood Disease Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China.
Bioinformatics. 2018 Feb 1;34(3):381-387. doi: 10.1093/bioinformatics/btx595.
DNA methylation is important for gene silencing and imprinting in both plants and animals. Recent advances in bisulfite sequencing allow single nucleotide variations (SNVs) to be detected with high sensitivity, but accurately identifying heterozygous SNVs from partially C-to-T converted sequences remains challenging.
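The difficulty can be illustrated with a toy simulation. The sketch below is not taken from CGmapTools; the helper names and the five-base alleles are hypothetical, and complete conversion of unmethylated cytosines is assumed. It shows that a heterozygous C/T site becomes indistinguishable on reads from the forward (Watson) strand after conversion, whereas reads from the reverse (Crick) strand still carry a G or an A at that site.

```python
def bisulfite_convert(read, methylated_positions=frozenset()):
    """Turn every unmethylated C into T (assumes complete conversion)."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(read)
    )


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical alleles of a heterozygous C/T site at index 2.
allele_c = "AACGA"
allele_t = "AATGA"

# Watson-strand reads: the unmethylated C is converted to T, so both
# alleles yield the same read and the SNV is hidden.
watson_c = bisulfite_convert(allele_c)
watson_t = bisulfite_convert(allele_t)

# Crick-strand reads: the site is read as G or A, which conversion
# does not touch, so the two alleles remain distinguishable.
crick_c = bisulfite_convert(reverse_complement(allele_c))
crick_t = bisulfite_convert(reverse_complement(allele_t))

print(watson_c, watson_t)  # identical reads
print(crick_c, crick_t)    # reads differ at index 2
```

With partial methylation, Watson-strand reads from the C allele would show a mixture of C and T, which is the situation a heterozygous-SNV caller has to separate from a true C/T genotype.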
We designed two methods, BayesWC and BinomWC, that substantially improved the precision of heterozygous SNV calls from ∼80% to 99% while retaining comparable recall. Building on these SNV calls, we provide functions for allele-specific DNA methylation (ASM) analysis and for visualizing the methylation status on individual reads. Applying ASM analysis to a previously published dataset, we found that an average of 1.5% of the investigated regions showed allelic methylation; these regions were significantly enriched in transposable elements and were likely to be shared within the same cell type. For low-coverage data, we used a dynamic fragment strategy to detect differentially methylated regions (DMRs); applied to a public cancer dataset, it identified DMRs related to key genes involved in tumorigenesis. Finally, we integrated 40 applications into the software package CGmapTools to analyze DNA methylomes. The package uses CGmap as its interface format, defines binary formats that reduce file size and support fast data retrieval, and supports context-wise, gene-wise, bin-wise, region-wise and sample-wise analyses and visualizations.
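As a rough illustration of how a binomial model can separate a heterozygous site from sequencing error, consider the sketch below. It is not the BinomWC algorithm, whose details are not given here; the function names, the error rate and the significance threshold are all assumptions made for the example. The allele counts are assumed to come only from reads on which bisulfite conversion cannot mimic the variant.

```python
from math import comb


def binom_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_heterozygous(alt_count, depth, error_rate=0.01, alpha=0.01):
    """Hypothetical rule: call a site heterozygous when neither allele's
    read count can be explained by sequencing error alone."""
    if depth == 0:
        return False
    ref_count = depth - alt_count
    # Probability of seeing this many alt (or ref) reads purely by error.
    p_alt_is_error = binom_tail(alt_count, depth, error_rate)
    p_ref_is_error = binom_tail(ref_count, depth, error_rate)
    return p_alt_is_error < alpha and p_ref_is_error < alpha


print(call_heterozygous(9, 20))   # both alleles well supported
print(call_heterozygous(1, 20))   # a single alt read is compatible with error
print(call_heterozygous(20, 20))  # no reference reads: not heterozygous
```

A production caller would also need to account for base quality, mapping quality and strand-specific coverage, none of which this sketch models.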
The CGmapTools software is freely available at https://cgmaptools.github.io/.
Supplementary data are available at Bioinformatics online.