Department of Obstetrics and Gynecology, Renmin Hospital of Wuhan University.
Department of Pathology, Shanghai Skin Disease Hospital, Tongji University School of Medicine.
J Vis Exp. 2021 Sep 18(175). doi: 10.3791/62528.
RNA sequencing (RNA-seq) is one of the most widely used technologies in transcriptomics because it can reveal the relationship between genetic alterations and complex biological processes, and it has great value in the diagnosis, prognosis, and treatment of tumors. Differential analysis of RNA-seq data is crucial for identifying aberrant transcription, and limma, edgeR, and DESeq2 are efficient tools for this analysis. However, RNA-seq differential analysis requires some skill with the R language and the ability to choose an appropriate method, both of which are lacking in the medical education curriculum. Herein, we provide a detailed protocol for identifying differentially expressed genes (DEGs) between cholangiocarcinoma (CHOL) and normal tissues with limma, DESeq2, and edgeR, respectively, and present the results in volcano plots and Venn diagrams. The three protocols are similar but differ in some steps of the analysis. For example, limma fits a linear model for its statistics, whereas edgeR and DESeq2 use the negative binomial distribution. Additionally, normalized RNA-seq count data are required for edgeR and limma but not for DESeq2. The results of the three methods partly overlap. All three methods have their own advantages, and the choice of method depends only on the data.
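The two result views named in the abstract, volcano plots and Venn diagrams, both reduce to simple set logic once each method has produced a log2 fold change and an adjusted p-value per gene. The following is a minimal, language-agnostic sketch in Python (the protocol itself runs in R). The cutoffs (|log2FC| >= 1, adjusted p < 0.05), the gene names, and all numeric values are illustrative assumptions, not taken from the protocol or its data.

```python
# Illustrative sketch only: thresholds, gene names, and values are invented.

def classify_gene(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene the way a volcano plot usually colors its points."""
    if adj_p < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "not_significant"


def select_degs(results, fc_cutoff=1.0, p_cutoff=0.05):
    """Return the set of genes called differentially expressed by one method."""
    return {
        gene
        for gene, (log2_fc, adj_p) in results.items()
        if classify_gene(log2_fc, adj_p, fc_cutoff, p_cutoff) != "not_significant"
    }


def venn_regions(deg_sets):
    """Group genes by the exact combination of methods that called them."""
    regions = {}
    for gene in set().union(*deg_sets.values()):
        key = tuple(sorted(name for name, genes in deg_sets.items() if gene in genes))
        regions.setdefault(key, set()).add(gene)
    return regions


# Hypothetical per-method output: gene -> (log2 fold change, adjusted p-value).
toy_results = {
    "limma": {"GENE_A": (2.3, 0.001), "GENE_B": (-1.8, 0.01),
              "GENE_C": (0.4, 0.20), "GENE_D": (1.2, 0.04)},
    "edgeR": {"GENE_A": (2.5, 0.0005), "GENE_B": (-2.0, 0.02),
              "GENE_C": (1.1, 0.03), "GENE_D": (0.9, 0.06)},
    "DESeq2": {"GENE_A": (2.4, 0.002), "GENE_B": (-0.7, 0.30),
               "GENE_C": (1.3, 0.01), "GENE_D": (1.0, 0.08)},
}

deg_sets = {method: select_degs(res) for method, res in toy_results.items()}
regions = venn_regions(deg_sets)

for methods, genes in sorted(regions.items()):
    print(" & ".join(methods), "->", sorted(genes))
```

In this toy input only one gene is called by all three methods, while the others fall into two-method or single-method regions, which is the kind of partial overlap the abstract describes.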