Amrit Francis R G, Ghazi Arjumand
Department of Pediatrics, University of Pittsburgh School of Medicine, Children's Hospital of Pittsburgh.
J Vis Exp. 2017 Apr 8(122):55473. doi: 10.3791/55473.
Next-generation sequencing (NGS) technologies have revolutionized the nature of biological investigation. Of these, RNA sequencing (RNA-Seq) has emerged as a powerful tool for gene-expression analysis and transcriptome mapping. However, handling RNA-Seq datasets requires sophisticated computational expertise and poses inherent challenges for biology researchers. This bottleneck has been mitigated by two resources: the open-access Galaxy project, which allows users without bioinformatics skills to analyze RNA-Seq data, and the Database for Annotation, Visualization, and Integrated Discovery (DAVID), a Gene Ontology (GO) term analysis suite that helps derive biological meaning from large datasets. Even so, for first-time users and bioinformatics novices, learning these platforms without guidance can be time-consuming and daunting. We describe a straightforward workflow that will help C. elegans researchers isolate worm RNA, conduct an RNA-Seq experiment, and analyze the data using the Galaxy and DAVID platforms. This protocol provides stepwise instructions for using the various Galaxy modules to access raw NGS data, perform quality-control checks, align reads, and carry out differential gene expression analysis, guiding the user with parameters at every step to generate a gene list that can be screened for enrichment of gene classes or biological processes using DAVID. Overall, we anticipate that this article will be useful to C. elegans researchers undertaking RNA-Seq experiments for the first time, as well as to frequent users running a small number of samples.