Hardy Alexis, Duharcourt Sandra, Defrance Matthieu
Université Libre de Bruxelles, Interuniversity Institute of Bioinformatics in Brussels (IB2), Brussels, Belgium.
Université Paris Cité, CNRS, Institut Jacques Monod, 75013, Paris, France.
Methods Mol Biol. 2023;2624:87-114. doi: 10.1007/978-1-0716-2962-8_7.
Advances in sequencing technologies now make it possible to map DNA modifications at base resolution across the whole genome. Long-read sequencing data can be used to identify modified base patterns. However, the downstream analysis of Pacific Biosciences (PacBio) or Oxford Nanopore Technologies (ONT) data requires the integration of genomic annotation and comprehensive filtering to prevent the accumulation of artifact signals. In this chapter, we present a linear workflow to fully analyze modified base patterns using the DNA Modification Annotation (DNAModAnnot) package. This workflow includes thorough filtering based on sequencing quality and false discovery rate estimation, and provides tools for a global analysis of DNA modifications. We provide an example application of this workflow to PacBio data and guide the user through the expected outputs via a fully integrated Rmarkdown script. The protocol includes tips on adapting the provided code to annotate the epigenome of any organism according to the user's needs.