Centre of Bioinformatics, Institute of Interdisciplinary Studies, University of Allahabad, Prayagraj, India.
Methods Mol Biol. 2022;2416:213-237. doi: 10.1007/978-1-0716-1908-7_14.
Over the last decade, RNA-Sequencing (RNA-Seq) has revolutionized the field of transcriptomics owing to its clear advantages over earlier technologies for studying gene expression. Stem cell bioinformatics has also benefited from these advances: RNA-Seq has helped researchers look deeper into how stem cells maintain pluripotency and how this may be exploited in regenerative medicine. However, because the technology is still evolving, there is no single accepted protocol for RNA-Seq data analysis. From the wide array of available tools and algorithms, researchers tend to assemble a pipeline best suited to their samples, experimental design, and computational resources. In this tutorial, we describe a pipeline based on open-source tools to analyze RNA-Seq data from naïve and primed state human pluripotent stem cell samples. Specifically, we show how RNA-Seq data can be downloaded from databases, processed, and used to identify differentially expressed genes and construct a co-expression network. We also show how the list of interesting genes obtained from differential expression testing or the co-expression network can be analyzed to gain biological insights.