Tarca Adi Laurentiu, Draghici Sorin, Khatri Purvesh, Hassan Sonia S, Mittal Pooja, Kim Jung-Sun, Kim Chong Jai, Kusanovic Juan Pedro, Romero Roberto
Department of Computer Science, Wayne State University, 431 State Hall, Detroit, MI 48202, USA.
Bioinformatics. 2009 Jan 1;25(1):75-82. doi: 10.1093/bioinformatics/btn577. Epub 2008 Nov 5.
Gene expression class comparison studies may identify hundreds or thousands of genes as differentially expressed (DE) between sample groups. One way to gain biological insight from such results is to identify the signaling pathways impacted by the observed changes. Most existing pathway analysis methods focus either on the number of DE genes observed in a given pathway (enrichment analysis methods) or on the correlation between the pathway genes and the class of the samples (functional class scoring methods). Both approaches treat pathways as simple sets of genes, disregarding the complex gene interactions that these pathways are built to describe.
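As an illustration of the enrichment analysis idea mentioned above, the following is a minimal sketch of a one-sided hypergeometric over-representation test, a common way to score the number of DE genes in a pathway. The function name and its arguments are hypothetical and chosen only for this example; the abstract does not specify which enrichment test any particular method uses.

```python
from math import comb


def enrichment_p_value(total_genes: int, pathway_genes: int,
                       de_genes: int, de_in_pathway: int) -> float:
    """Return P(X >= de_in_pathway) for X ~ Hypergeometric.

    total_genes   : number of genes measured (the reference set)
    pathway_genes : number of reference genes annotated to the pathway
    de_genes      : number of reference genes called differentially expressed
    de_in_pathway : number of DE genes that fall in the pathway

    All names are illustrative assumptions, not part of any published API.
    """
    if not (0 <= pathway_genes <= total_genes and 0 <= de_genes <= total_genes):
        raise ValueError("pathway_genes and de_genes must lie in [0, total_genes]")
    if not (0 <= de_in_pathway <= min(pathway_genes, de_genes)):
        raise ValueError("de_in_pathway is outside the feasible range")

    upper = min(pathway_genes, de_genes)
    denominator = comb(total_genes, de_genes)
    # Sum the upper tail: probability of seeing at least this many DE genes
    # in the pathway if DE genes were drawn at random from the reference set.
    tail = sum(
        comb(pathway_genes, i) * comb(total_genes - pathway_genes, de_genes - i)
        for i in range(de_in_pathway, upper + 1)
    )
    return tail / denominator
```

A small P-value here says only that the pathway contains more DE genes than expected by chance; it uses no information about how the genes interact, which is the limitation the abstract points out.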
We describe a novel signaling pathway impact analysis (SPIA) that combines the evidence obtained from classical enrichment analysis with a new type of evidence, which measures the actual perturbation on a given pathway under a given condition. A bootstrap procedure is used to assess the significance of the observed total pathway perturbation. Using simulations, we show that the evidence derived from perturbations is independent of the pathway enrichment evidence. This independence allows us to calculate a global pathway significance P-value, which combines the enrichment and perturbation P-values. We illustrate the capabilities of the method on four real datasets. The results obtained on these data show that SPIA has higher specificity and sensitivity than several widely used pathway analysis methods.
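To make the role of independence concrete: when two P-values are independent and uniformly distributed under the null hypothesis, the probability that their product is at most c is c - c ln(c). The sketch below combines an enrichment P-value and a perturbation P-value this way. The function and parameter names are hypothetical, and the abstract does not state the exact combination formula, so this should be read as one standard choice under the independence assumption rather than as the package's implementation.

```python
import math


def combine_p_values(p_enrichment: float, p_perturbation: float) -> float:
    """Combine two independent P-values through the distribution of their product.

    For independent U1, U2 ~ Uniform(0, 1): P(U1 * U2 <= c) = c - c * ln(c).
    Names are illustrative assumptions, not taken from the SPIA package.
    """
    for p in (p_enrichment, p_perturbation):
        if not 0.0 <= p <= 1.0:
            raise ValueError("P-values must lie in [0, 1]")

    c = p_enrichment * p_perturbation
    if c == 0.0:
        # The limit of c - c * ln(c) as c approaches 0 is 0.
        return 0.0
    return c - c * math.log(c)
```

For example, two P-values of 0.05 give a product of 0.0025 and a combined value of about 0.0175. The formula is only valid if the two kinds of evidence are independent, which is why the simulation result in the abstract matters.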
SPIA was implemented as an R package available at http://vortex.cs.wayne.edu/ontoexpress/