State Key Laboratory of Forest Tree Genetics and Breeding, Northeast Forestry University, Harbin, Heilongjiang 150040, China.
State Key Laboratory of Forest Tree Genetics and Breeding, Northeast Forestry University, Harbin, Heilongjiang 150040, China; Biotechnology Research Center, School of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA.
Mol Plant. 2015 Feb;8(2):196-206. doi: 10.1016/j.molp.2014.11.012. Epub 2014 Dec 19.
Microarray and RNA-seq experiments have become an important part of modern genomics and systems biology. Obtaining meaningful biological data from these experiments is an arduous task that demands close attention to many details. Negligence at any step can yield gene expression data that contain inadequate or composite information from which patterns are difficult to extract. It is therefore imperative to consider experimental design carefully before launching a time-consuming and costly experiment. Currently, most genomics experiments have two objectives: (1) to generate two or more groups of comparable data for identifying differentially expressed genes, gene families, biological processes, or metabolic pathways under experimental conditions; and (2) to build local gene regulatory networks and identify hierarchically important regulators governing the biological processes and pathways of interest. The first objective aims to identify the active molecular entities, while the second provides a basis for understanding the underlying molecular mechanisms by inferring causal relationships mediated by the treatment. An optimal experiment therefore produces biologically relevant and extractable data that meet both objectives without substantially increasing cost. This review discusses the major issues that researchers commonly face when embarking on microarray or RNA-seq experiments and summarizes important aspects of experimental design, with the aim of helping researchers decide how to generate gene expression profiles with low background noise but more interaction information, to facilitate novel biological discoveries in modern plant genomics.