Cozzuto Luca, Delgado-Tejedor Anna, Hermoso Pulido Toni, Novoa Eva Maria, Ponomarenko Julia
Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.
Universitat Pompeu Fabra (UPF), Barcelona, Spain.
Methods Mol Biol. 2023;2624:185-205. doi: 10.1007/978-1-0716-2962-8_13.
This chapter describes MasterOfPores v.2 (MoP2), an open-source suite of pipelines for processing and analyzing direct RNA Oxford Nanopore sequencing data. MoP2 relies on the Nextflow DSL2 framework and Linux containers, thus enabling reproducible data analysis in transcriptomic and epitranscriptomic studies. We introduce the key concepts of MoP2 and provide a complete, fully reproducible, step-by-step example of how to use the workflow to analyze S. cerevisiae total RNA samples sequenced on MinION flowcells. The workflow starts with the pre-processing of raw FAST5 files, which includes basecalling, read quality control, demultiplexing, filtering, mapping, estimation of per-gene/transcript abundances, and transcriptome assembly, with GPU support for the basecalling and read demultiplexing steps. The secondary analyses focus on estimating RNA poly(A) tail lengths and identifying RNA modifications. The MoP2 code is available at https://github.com/biocorecrg/MOP2 and is distributed under the MIT license.