Biological Data Science Division, Research Center for Advanced Science and Technologies, The University of Tokyo, Tokyo, Japan.
Methods Mol Biol. 2023;2632:299-319. doi: 10.1007/978-1-0716-2996-3_21.
RNA modifications regulate multiple aspects of cellular function, including RNA splicing, translation, export, decay, stability, and phase separation. Direct RNA sequencing from Oxford Nanopore Technologies (ONT) is a recent advance that allows such modifications to be detected comprehensively. However, the method produces large volumes of highly complex data in the form of raw current signal, which poses a new informatics challenge for accurate modification detection. Here, we provide nanoDoc2, software for detecting multiple types of RNA modification from nanopore direct RNA sequencing data. nanoDoc2 includes a novel signal segmentation algorithm based on the trace value, a base probability feature that the Guppy basecalling program from ONT adds while processing the raw signal. At its core, nanoDoc2 uses a machine learning algorithm in which 6-mer segmented raw current signal is analyzed by deep one-class classification with a WaveNet-based neural network. As output, an RNA modification is detected by a statistical score at each candidate position. Herein, we give detailed instructions on how to use nanoDoc2 to segment signals, train and test the neural network, and finally predict RNA modifications present in nanopore direct RNA sequencing data.
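To make the notion of a "6-mer segmented raw current signal" concrete, the following is a minimal sketch of how per-base signal segments might be grouped into overlapping 6-mer windows before classification. It is an illustration only and not nanoDoc2's code or API: the function name `kmer_windows`, the input layout (one list of current samples per base), and the toy values are all assumptions.

```python
from typing import List, Sequence, Tuple


def kmer_windows(
    sequence: str,
    segments: Sequence[Sequence[float]],
    k: int = 6,
) -> List[Tuple[str, List[float]]]:
    """Group per-base signal segments into overlapping k-mer windows.

    Hypothetical helper: `segments[i]` is assumed to hold the raw current
    samples assigned to base `sequence[i]` by an earlier segmentation step.
    """
    if len(sequence) != len(segments):
        raise ValueError("sequence and segments must have the same length")

    windows = []
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Concatenate the samples of the k consecutive bases in this window.
        signal = [x for seg in segments[start:start + k] for x in seg]
        windows.append((kmer, signal))
    return windows


# Toy input: 8 bases, two made-up current samples per base.
sequence = "ACGUACGU"
segments = [[float(i)] * 2 for i in range(len(sequence))]
windows = kmer_windows(sequence, segments)

for kmer, signal in windows:
    print(kmer, len(signal))
```

Each window pairs a 6-mer with the current samples of its six bases. In a workflow like the one described, such windows would be the units passed to the one-class classifier.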