Istituto di Studi sui Sistemi Intelligenti per l'Automazione, CNR, Bari, Italy.
BMC Bioinformatics. 2009 Sep 2;10:275. doi: 10.1186/1471-2105-10-275.
The analysis of high-throughput gene expression data with respect to sets of genes rather than individual genes has many advantages. A variety of methods have been developed for assessing the enrichment of sets of genes with respect to differential expression. In this paper we provide a comparative study of four of these methods: Fisher's exact test, Gene Set Enrichment Analysis (GSEA), Random-Sets (RS), and Gene List Analysis with Prediction Accuracy (GLAPA). The first three methods use associative statistics, while the fourth uses predictive statistics. We first compare all four methods on simulated data sets to verify that Fisher's exact test is markedly worse than the other three approaches. We then validate the other three methods on seven real data sets with known genetic perturbations and then compare the methods on two cancer data sets where our a priori knowledge is limited.
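As an illustration of the simplest of the four approaches, Fisher's exact test applied to gene set enrichment can be sketched as a one-sided hypergeometric tail probability. This is a minimal sketch, not the implementation used in the study: the function name `fisher_enrichment_pvalue`, its parameters, and the choice of a one-sided test are assumptions made here, and it presumes that a list of differentially expressed genes has already been obtained by some cutoff.

```python
from math import comb


def fisher_enrichment_pvalue(n_universe, n_de, set_size, overlap):
    """One-sided Fisher's exact test for over-representation.

    n_universe: number of genes measured (hypothetical parameter name)
    n_de:       number of genes called differentially expressed
    set_size:   number of genes in the gene set
    overlap:    number of gene-set members that are differentially expressed

    Returns P(X >= overlap) for X drawn from the hypergeometric distribution.
    """
    if max(n_de, set_size) > n_universe:
        raise ValueError("n_de and set_size cannot exceed n_universe")
    if not 0 <= overlap <= min(n_de, set_size):
        raise ValueError("overlap is out of range")

    total = comb(n_universe, set_size)
    upper = min(n_de, set_size)
    # Sum the hypergeometric probabilities for k = overlap, ..., upper.
    # math.comb returns 0 when k exceeds n, so impossible tables drop out.
    tail = sum(
        comb(n_de, k) * comb(n_universe - n_de, set_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 10000 genes, 500 called differentially
    # expressed, a 50-gene set with 10 of its members in that list.
    print(fisher_enrichment_pvalue(10000, 500, 50, 10))
```

The test depends on a hard cutoff for calling genes differentially expressed, so a set of genes with weak but coordinated changes contributes nothing to `overlap`.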
The simulation study shows that none of the three methods consistently outperforms the others. GSEA and RS are able to detect weak signals of deregulation, and they perform differently when genes in a gene set are both up- and down-regulated. GLAPA is more conservative: large differences between the two phenotypes are required for the method to detect differential deregulation in gene sets. This is because the enrichment statistic in GLAPA is prediction error, which is a stronger criterion than the classical two-sample statistics used in RS and GSEA. This was reflected in the analysis of real data sets, where GSEA and RS found particular gene sets significant while GLAPA did not, suggesting a small effect size. We find that the ranking of gene sets induced by GLAPA is more similar to that of RS than to that of GSEA. More importantly, the rankings of the three methods share significant overlap.
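The contrast between an associative two-sample statistic and a predictive criterion can be illustrated on a single score per sample. The sketch below is only an assumption-laden toy: it uses Welch's t statistic as the two-sample statistic and the leave-one-out error of a nearest-centroid classifier as the prediction error, neither of which is claimed to be the exact statistic used by RS, GSEA, or GLAPA, and the data are made up.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of scores."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def loo_nearest_centroid_error(a, b):
    """Leave-one-out error rate of a nearest-centroid classifier (1-D)."""
    samples = [(x, 0) for x in a] + [(x, 1) for x in b]
    errors = 0
    for i, (x, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        # Class centroids are recomputed without the held-out sample.
        c0 = mean([v for v, lab in rest if lab == 0])
        c1 = mean([v for v, lab in rest if lab == 1])
        predicted = 0 if abs(x - c0) <= abs(x - c1) else 1
        errors += predicted != label
    return errors / len(samples)


# Made-up scores: 50 samples per phenotype, values 0..9 repeated.
group_a = [i % 10 for i in range(50)]
weak_b = [x + 2 for x in group_a]     # small shift between phenotypes
strong_b = [x + 20 for x in group_a]  # large shift between phenotypes

if __name__ == "__main__":
    for name, group_b in (("weak", weak_b), ("strong", strong_b)):
        t = welch_t(group_a, group_b)
        err = loo_nearest_centroid_error(group_a, group_b)
        print(f"{name}: t = {t:.2f}, leave-one-out error = {err:.2f}")
```

With the small shift, the t statistic is large in magnitude because the sample size is moderate, while a substantial fraction of samples is still misclassified. Only the large shift drives the prediction error to zero, which is consistent with a predictive criterion being the more demanding one.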
The three methods considered in our study recover relevant gene sets known to be deregulated in the experimental conditions and pathologies analyzed. There are differences between the three methods, and GSEA seems to be more consistent in finding enriched gene sets, although no method uniformly dominates over all data sets. Our analysis highlights the deep difference between associative and predictive methods for detecting enrichment and supports the use of both to better interpret the results of pathway analysis. We close with suggestions for users of gene set methods.