Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA.
Genomics. 2011 Dec;98(6):469-77. doi: 10.1016/j.ygeno.2011.09.001. Epub 2011 Sep 24.
Analyzing gene expression data at the gene set level greatly improves feature extraction and data interpretation. Currently, most efforts in gene set analysis focus on differential expression analysis: finding gene sets whose genes show a first-order relationship with the clinical outcome. However, the regulation of the biological system is complex, and much of the change in gene expression dynamics does not manifest as differential expression. At the gene set level, capturing changes in expression dynamics is difficult because of the complexity and heterogeneity of the gene sets. Here we report a systematic approach to detect gene sets that show differential coordination patterns with the rest of the transcriptome, as well as pairs of gene sets that are differentially coordinated with each other. We demonstrate that the method can identify biologically relevant gene sets, many of which do not show a first-order relationship with the clinical outcome.
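The distinction between differential expression and differential coordination can be illustrated with a toy simulation. The sketch below is not the method reported in the paper, whose details are not given in this abstract; it is a generic two-gene example that assumes Pearson correlation as the coordination measure and a Fisher z-transform to compare correlations between two outcome groups. All names (`simulate_group`, `fisher_z_difference`, the "ctrl" and "case" groups) and all simulated values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se


def simulate_group(n, slope, rng):
    """Hypothetical expression of two genes: gene_b = slope * gene_a + noise."""
    gene_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gene_b = [slope * a + rng.gauss(0.0, 0.5) for a in gene_a]
    return gene_a, gene_b


rng = random.Random(0)  # fixed seed so the simulation is reproducible
n = 200

# Same marginal distributions in both groups, but opposite coupling.
a_ctrl, b_ctrl = simulate_group(n, slope=1.0, rng=rng)   # positively coupled
a_case, b_case = simulate_group(n, slope=-1.0, rng=rng)  # negatively coupled

# First-order view: difference in mean expression of gene B between groups.
mean_shift_b = abs(sum(b_case) / n - sum(b_ctrl) / n)

# Coordination view: correlation of gene A with gene B within each group.
r_ctrl = pearson(a_ctrl, b_ctrl)
r_case = pearson(a_case, b_case)
z = fisher_z_difference(r_ctrl, n, r_case, n)

print(f"mean shift of gene B between groups: {mean_shift_b:.3f}")
print(f"correlation in controls: {r_ctrl:.3f}, in cases: {r_case:.3f}")
print(f"Fisher z for the correlation difference: {z:.2f}")
```

In this simulation the mean expression of each gene is nearly identical in the two groups, so a first-order comparison would find little, while the sign of the correlation flips. A gene-set-level analysis would need to aggregate many such pairwise signals, which this sketch does not attempt.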