Lun Aaron T L, Chen Yunshun, Smyth Gordon K
Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
Methods Mol Biol. 2016;1418:391-416. doi: 10.1007/978-1-4939-3578-9_19.
RNA sequencing (RNA-seq) is widely used to profile transcriptional activity in biological systems. Here we present a pipeline for differential expression analysis of RNA-seq experiments using the Rsubread and edgeR software packages. The basic pipeline includes read alignment and counting, filtering and normalization, modelling of biological variability, and hypothesis testing. For hypothesis testing, we focus in particular on the quasi-likelihood features of edgeR. Some more advanced downstream analysis steps are also covered, including complex comparisons, gene ontology enrichment analyses and gene set testing. The code required to run each step is described, along with an outline of the underlying theory. The chapter includes a case study in which the pipeline is used to study the expression profiles of mammary gland cells in virgin, pregnant and lactating mice.
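As a rough illustration of the filtering step named in the abstract, the sketch below removes genes with low counts-per-million (CPM) from a toy count matrix. It is not the chapter's code, which runs in R with Rsubread and edgeR. The function names, the toy counts, and the thresholds (`min_cpm`, `min_samples`) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: a plain-Python CPM filter, not edgeR's implementation.
# Function names, thresholds and the toy data are hypothetical.

def counts_per_million(counts):
    """Convert a genes x samples count matrix (list of rows) to CPM."""
    n_samples = len(counts[0])
    # Library size = total number of counted reads in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[1e6 * row[j] / lib_sizes[j] for j in range(n_samples)]
            for row in counts]

def filter_low_expression(counts, gene_ids, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM exceeds min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    keep = [sum(value > min_cpm for value in row) >= min_samples for row in cpm]
    return [gene for gene, kept in zip(gene_ids, keep) if kept]

# Toy data: three genes (rows) measured in four samples (columns).
gene_ids = ["geneA", "geneB", "geneC"]
counts = [
    [500, 600, 550, 450],
    [0, 1, 0, 0],
    [499500, 399399, 449450, 549550],
]

kept_genes = filter_low_expression(counts, gene_ids)
print(kept_genes)
```

In this toy example geneB exceeds the CPM threshold in only one sample and is dropped. Suitable thresholds in a real analysis depend on library sizes and the experimental design.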