Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets.

Author information

Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA.

Publication information

Bioinformatics. 2009 Sep 15;25(18):2348-54. doi: 10.1093/bioinformatics/btp406. Epub 2009 Jul 2.

Abstract

MOTIVATION

Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, there is still a need to quantify their relative performance when they test different null hypotheses.

RESULTS

In this article, we compare the performance of univariate and multivariate tests on both simulated and biological data. In the simulation study, we demonstrate that high correlations affect the power of univariate and multivariate tests equally. In addition, for most of the tests, power is similarly affected by the dimensionality of the gene set and by the percentage of genes in the set whose expression changes between the two phenotypes. Applying different test statistics to biological data reveals that three statistics (sum of squared t-tests, Hotelling's T², N-statistic), which test different null hypotheses, find some common but also some complementary differentially expressed gene sets under specific settings. This demonstrates that, because their null hypotheses are complementary, each test captures a different aspect of the data, and that for the analysis of biological data it is beneficial to use all three tests together rather than focusing exclusively on one.
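To make two of the statistics named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes expression matrices with samples in rows and the genes of one set in columns, a pooled-variance two-sample t statistic per gene, a pseudo-inverse of the pooled covariance (so the sketch still runs when the set has more genes than samples), and a simple label-permutation p-value. The function names and the permutation scheme are illustrative assumptions; the N-statistic is omitted.

```python
import numpy as np


def sum_squared_t(x, y):
    """Sum over genes of squared pooled-variance two-sample t statistics.

    x, y: arrays of shape (samples, genes) for the two phenotypes.
    """
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    pooled_var = ((n1 - 1) * x.var(axis=0, ddof=1)
                  + (n2 - 1) * y.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    t = diff / np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return float(np.sum(t ** 2))


def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 with a pooled covariance matrix.

    Assumption: the pseudo-inverse replaces the ordinary inverse so that
    a singular covariance (genes >= samples) does not raise an error.
    """
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    pooled_cov = ((n1 - 1) * cov_x + (n2 - 1) * cov_y) / (n1 + n2 - 2)
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.pinv(pooled_cov) @ diff)


def permutation_pvalue(stat, x, y, n_perm=500, seed=0):
    """P-value of `stat` obtained by permuting phenotype labels (illustrative)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        if stat(pooled[idx[:n1]], pooled[idx[n1:]]) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    control = rng.normal(size=(20, 5))             # 20 samples, 5 genes
    treated = rng.normal(size=(20, 5)) + 1.0       # mean shift in every gene
    for stat in (sum_squared_t, hotelling_t2):
        print(stat.__name__, permutation_pvalue(stat, control, treated))
```

The sum of squared t statistics ignores correlations between genes, whereas Hotelling's T² weights the mean difference by the inverse covariance, which is one way to see why the two can flag different gene sets.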
