Pollier Jacob, Rombauts Stephane, Goossens Alain
Department of Plant Systems Biology, VIB, Gent, Belgium.
Methods Mol Biol. 2013;1011:305-15. doi: 10.1007/978-1-62703-414-2_24.
The recent development of various deep sequencing techniques has led to the most powerful transcript profiling method available to date: RNA sequencing (RNA-Seq). Besides identifying new genes and new splice variants of known genes, RNA-Seq makes it possible to compare the whole transcriptome of any organism under two or more experimental conditions, such as before and after jasmonate treatment. However, the vast amounts of data generated in RNA-Seq experiments require complex computational methods for read mapping and expression quantification. Here, we describe a detailed protocol for the analysis of deep sequencing data, starting from the raw RNA-Seq reads. First, a quality check is performed on the raw reads to assess the quality of the sequencing run. Subsequently, adapters and low-quality sequences are trimmed from the raw reads. The resulting processed reads are mapped to the reference genome, and the mapped reads are counted to generate expression data for the annotated genes in each sample. This method can be used to analyze RNA-Seq data from any organism for which a reference genome is available.
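As a rough illustration of two of the steps described above (trimming and read counting), the following pure-Python sketch may help. It is not the protocol's implementation: the abstract does not name the tools used, and in practice dedicated programs handle these steps. The adapter sequence, the quality and length thresholds, the function names, and the simple overlap rule for assigning reads to genes are all assumptions made for this example, and the read mapping step itself is left out.

```python
from collections import Counter

# Assumptions for illustration only; none of these values come from the protocol.
ADAPTER = "AGATCGGAAGAGC"  # example adapter sequence
MIN_QUAL = 20              # example Phred quality cutoff
MIN_LEN = 5                # example minimum read length after trimming


def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string (Phred+33 assumed) to integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def trim_read(seq, qual, adapter=ADAPTER, min_qual=MIN_QUAL):
    """Cut at the first exact adapter match, then trim low-quality 3' bases."""
    pos = seq.find(adapter)
    if pos != -1:
        seq, qual = seq[:pos], qual[:pos]
    scores = phred_scores(qual)
    end = len(scores)
    # Walk back from the 3' end while the base quality is below the cutoff.
    while end > 0 and scores[end - 1] < min_qual:
        end -= 1
    return seq[:end], qual[:end]


def passes_length_filter(seq, min_len=MIN_LEN):
    """Keep only reads that are still long enough after trimming."""
    return len(seq) >= min_len


def count_reads_per_gene(alignments, genes):
    """Count mapped reads per annotated gene.

    alignments: iterable of (chrom, start, end), 0-based half-open intervals.
    genes: dict of gene_id -> (chrom, start, end), same coordinate convention.
    Reads overlapping no gene, or more than one gene, are not counted
    (a simplifying assumption of this sketch).
    """
    counts = Counter({gene_id: 0 for gene_id in genes})
    for chrom, start, end in alignments:
        hits = [
            gene_id
            for gene_id, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and end > g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts
```

Running `count_reads_per_gene` once per sample and collecting the results side by side would give a gene-by-sample count table of the kind the abstract refers to as expression data.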