Love Michael I, Anders Simon, Kim Vladislav, Huber Wolfgang
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, Massachusetts, USA.
Institute for Molecular Medicine Finland, Helsinki, Finland; European Molecular Biology Laboratory, Heidelberg, Germany.
F1000Res. 2015 Oct 14;4:1070. doi: 10.12688/f1000research.7035.1. eCollection 2015.
Here we walk through an end-to-end gene-level RNA-seq differential expression workflow using Bioconductor packages. We start from the FASTQ files, show how these were aligned to the reference genome, and prepare a count matrix that tallies the number of RNA-seq reads/fragments within each gene for each sample. We then perform exploratory data analysis (EDA) to assess quality and explore the relationships between samples, perform differential gene expression analysis, and visually explore the results.