Contreras-López Orlando, Moyano Tomás C, Soto Daniela C, Gutiérrez Rodrigo A
Departamento de Genética Molecular y Microbiología, FONDAP Center for Genome Regulation, Millennium Institute for Integrative Systems and Synthetic Biology (MIISSB), Pontificia Universidad Católica de Chile, Santiago, Chile.
Methods Mol Biol. 2018;1761:275-301. doi: 10.1007/978-1-4939-7747-5_21.
The rapid increase in the availability of transcriptomics data generated by RNA sequencing represents both a challenge and an opportunity for biologists without bioinformatics training. The challenge is handling, integrating, and interpreting these data sets. The opportunity is to use this information to generate testable hypotheses about the molecular mechanisms controlling gene expression and biological processes (Fig. 1). A successful strategy for generating tractable hypotheses from transcriptomics data has been to build undirected network graphs based on patterns of gene co-expression. Many examples of new hypotheses derived from network analyses can be found in the literature, spanning different organisms, including plants, and specific fields such as root developmental biology. To make the construction of a gene co-expression network more accessible to biologists, here we provide step-by-step instructions using published RNA-seq experimental data obtained from a public database. Similar strategies have been used in previous studies to advance root developmental biology. This guide includes basic instructions for operating widely used open source platforms such as Bio-Linux, R, and Cytoscape. Even though the data used in this example were obtained from Arabidopsis thaliana, the workflow developed in this guide can be easily adapted to work with RNA-seq data from any organism.
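To make the idea of an undirected co-expression graph concrete, the following is a minimal sketch in Python, not the chapter's own R-based protocol. It assumes that co-expression is scored with the Pearson correlation and that an edge is kept when the absolute correlation reaches a cutoff. The gene names, expression values, the 0.9 threshold, and the helper names `pearson` and `coexpression_edges` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A profile with zero variance has no defined correlation; treat it as 0.
    return numerator / denominator if denominator else 0.0


def coexpression_edges(expression, threshold=0.9):
    """Return undirected edges (gene1, gene2, r) with |r| >= threshold.

    `expression` maps a gene identifier to its expression values across
    samples. The threshold is an illustrative assumption, not a recommended
    value.
    """
    edges = []
    for gene1, gene2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene1], expression[gene2])
        if abs(r) >= threshold:
            edges.append((gene1, gene2, round(r, 3)))
    return edges


# Hypothetical toy data: four genes measured in four samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}

edges = coexpression_edges(toy_expression)

# Write a tab-separated edge list, one edge per line.
for gene1, gene2, r in edges:
    print(f"{gene1}\t{gene2}\t{r}")
```

Each output line describes one undirected edge. A tab-separated edge list of this kind can be loaded into Cytoscape as a network table. With real RNA-seq data, normalization, filtering of lowly expressed genes, and the choice of correlation measure and cutoff all need separate consideration.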