Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, 10032, USA.
BMC Genomics. 2019 Feb 4;20(Suppl 1):78. doi: 10.1186/s12864-018-5372-8.
Recent advances in single-molecule sequencing techniques, such as Nanopore sequencing, have improved read length, increased sequencing throughput, and enabled direct detection of DNA modifications through the analysis of raw signals. These DNA modifications include naturally occurring modifications such as DNA methylation, as well as modifications introduced by DNA damage or by synthetic modification of one of the four standard nucleotides.
To improve the detection of DNA modifications, especially synthetically introduced ones, we developed a novel computational tool called NanoMod. NanoMod takes raw signal data from a pair of DNA samples, one with and one without modified bases, extracts signal intensities, performs base error correction against a reference sequence, and then identifies modified bases by comparing the distribution of raw signals between the two samples, while taking into account the effects of neighboring bases on modified bases ("neighborhood effects").
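The comparison step described above can be illustrated with a minimal sketch. It is not NanoMod's implementation: the abstract does not name the statistical test or how neighboring positions are combined, so the two-sample Kolmogorov-Smirnov statistic, the simple window average, all function names, and the simulated signal values below are assumptions chosen for illustration only.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    # Evaluate both empirical CDFs at every observed value.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def position_scores(modified, control):
    """Per-position statistic comparing signal values of the two samples."""
    return [ks_statistic(m, c) for m, c in zip(modified, control)]


def neighborhood_scores(scores, flank=2):
    """Average each position's score with its neighbors within `flank` bases.

    Assumption: a plain window mean stands in for whatever weighting a real
    tool would use to model neighborhood effects.
    """
    out = []
    for i in range(len(scores)):
        window = scores[max(0, i - flank): i + flank + 1]
        out.append(sum(window) / len(window))
    return out


def simulate(n_pos=21, mod_pos=10, n_reads=200, shift=3.0, spill=1.0, seed=0):
    """Hypothetical data: one modified base shifts its own signal by `shift`
    and the signal of bases within 2 positions by `spill` (arbitrary units)."""
    rng = random.Random(seed)
    control = [[rng.gauss(100, 2) for _ in range(n_reads)] for _ in range(n_pos)]
    modified = []
    for p in range(n_pos):
        d = abs(p - mod_pos)
        mu = 100 + (shift if d == 0 else spill if d <= 2 else 0.0)
        modified.append([rng.gauss(mu, 2) for _ in range(n_reads)])
    return modified, control


modified, control = simulate()
raw = position_scores(modified, control)
smoothed = neighborhood_scores(raw)
for pos in range(6, 15):
    print(f"pos {pos:2d}  raw={raw[pos]:.3f}  neighborhood={smoothed[pos]:.3f}")
```

In this toy setting the raw statistic peaks at the modified position, while the window-averaged score stays elevated across the flanking bases, which is the kind of spillover a method has to account for when assigning a modification to a single base.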
We evaluated NanoMod on simulated data sets covering different types of modifications and different magnitudes of neighborhood effects, and found that it outperformed other methods in identifying known modified bases. Additionally, NanoMod showed superior performance on an E. coli data set with 5mC (5-methylcytosine) modifications.
In summary, NanoMod is a flexible tool for detecting DNA modifications at single-base resolution from raw Nanopore sequencing signals, and it will facilitate large-scale functional genomics experiments that use modified nucleotides.