Huang Qianli, Sun Ming-An, Yan Ping
School of Biological and Medical Engineering, Hefei University of Technology, Hefei, China.
Epigenomics and Computational Biology Lab, Biocomplexity Institute of Virginia Tech, Blacksburg, VA, USA.
Methods Mol Biol. 2018;1751:35-55. doi: 10.1007/978-1-4939-7710-9_3.
In recent years, transcriptome sequencing has become widely used, with applications ranging from simple mRNA profiling to discovery and analysis of the entire transcriptome. One of its most common aims is to identify genes that are differentially expressed (DE) between two or more biological conditions, and to infer the associated pathways and gene networks from the expression profiles. These results can open avenues for further systematic investigation of the underlying biological mechanisms. Gene set (GS) enrichment analysis is a popular approach for identifying pathways or sets of genes that are significantly enriched among differentially expressed genes. However, this approach treats a pathway as a simple collection of genes and disregards knowledge of gene or protein interactions. In contrast, topology-based methods integrate the topological structure of a pathway or gene network into the analysis. To provide a panoramic view of such approaches, this chapter demonstrates several recent computational workflows, including gene set enrichment and topology-based methods, for the analysis of DE pathways and gene networks from transcriptome-wide sequencing data.
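As an illustration of the gene set enrichment idea described in the abstract, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution, which is one common form of gene set enrichment analysis. It is not taken from the chapter, whose specific workflows are not detailed here. The function name `hypergeom_enrichment_p`, the gene identifiers, and the toy gene sets are all hypothetical and serve only to show the calculation.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_set, n_de, n_overlap):
    """Upper-tail hypergeometric p-value for gene set over-representation.

    Returns P(X >= n_overlap), where X is the number of gene set members
    among n_de genes drawn without replacement from a universe of
    n_universe genes, n_set of which belong to the gene set.
    """
    total = comb(n_universe, n_de)
    upper = min(n_set, n_de)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data (assumption, for illustration only):
# a universe of 20 measured genes, 5 of them called DE,
# and one pathway containing 4 genes.
universe = {f"g{i}" for i in range(1, 21)}
de_genes = {"g1", "g2", "g3", "g4", "g5"}
pathway = {"g1", "g2", "g3", "g10"}

# Number of DE genes that fall inside the pathway.
overlap = len(de_genes & pathway)

p_value = hypergeom_enrichment_p(
    len(universe), len(pathway), len(de_genes), overlap
)
print(f"overlap={overlap}, p={p_value:.4f}")
```

Note that this test uses only pathway membership: it would give the same p-value however the pathway genes interact with one another. That is the limitation the abstract attributes to gene set approaches and that topology-based methods aim to address. In practice the test is repeated for many gene sets, so the resulting p-values would also need correction for multiple testing.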