School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.
Nat Commun. 2021 Oct 13;12(1):5976. doi: 10.1038/s41467-021-26278-9.
In plants, cytosine DNA methylation (5mC) can occur in three sequence contexts, CpG, CHG, and CHH (where H = A, C, or T), which play different roles in the regulation of biological processes. Although long Nanopore reads have advantages over short-read bisulfite sequencing for detecting 5mC, existing methods can detect 5mC only in the CpG context, which limits their application in plants. Here, we develop DeepSignal-plant, a deep learning tool that detects genome-wide 5mC in all three contexts in plants from Nanopore reads. We sequence Arabidopsis thaliana and Oryza sativa using both Nanopore and bisulfite sequencing. We develop a denoising process for training models, which enables DeepSignal-plant to achieve high correlations with bisulfite sequencing for 5mC detection in all three contexts. Furthermore, DeepSignal-plant can profile more 5mC sites, which will help to provide a more complete understanding of the epigenetic mechanisms of different biological processes.
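The three sequence contexts defined above can be illustrated with a minimal Python sketch that classifies a cytosine by the two bases following it. The function name `cytosine_context` is hypothetical and is not part of DeepSignal-plant. The sketch assumes a forward-strand sequence over A, C, G, and T; cytosines on the reverse strand would first need the reverse complement.

```python
from typing import Optional


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Classify the cytosine at seq[pos] as 'CpG', 'CHG', or 'CHH'.

    Hypothetical helper for illustration only. Returns None if the base is
    not a cytosine, or if the context cannot be determined (too close to the
    end of the sequence, or an ambiguous base such as N follows).
    """
    seq = seq.upper()
    if seq[pos] != "C":
        return None

    following = seq[pos + 1 : pos + 3]  # up to two downstream bases

    # CpG: a cytosine immediately followed by a guanine.
    if following[:1] == "G":
        return "CpG"

    # CHG and CHH both need two unambiguous downstream bases.
    if len(following) < 2 or any(base not in "ACGT" for base in following):
        return None

    # The first downstream base is now known to be H (A, C, or T).
    return "CHG" if following[1] == "G" else "CHH"


if __name__ == "__main__":
    example = "ACGTCAGTCTTA"
    for i, base in enumerate(example):
        if base == "C":
            print(i, cytosine_context(example, i))
```

Under this definition every cytosine with two unambiguous downstream bases falls into exactly one of the three contexts, which is why a tool limited to CpG leaves the CHG and CHH sites uncovered.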