Man M Z, Wang X, Wang Y
Biostatistics, PGRD, 2800 Plymouth Road, Ann Arbor, MI 48105, USA.
Bioinformatics. 2000 Nov;16(11):953-9. doi: 10.1093/bioinformatics/16.11.953.
The Serial Analysis of Gene Expression (SAGE) technology determines the expression level of a gene by measuring the frequency of a sequence tag derived from the corresponding mRNA transcript. Several statistical tests have been developed to detect significant differences in tag frequency between two samples. However, which one of these tests has the greatest power to detect real changes remains undetermined.
This paper compares three statistical tests for detecting significant changes of gene expression in SAGE experiments. The comparison uses Monte Carlo simulation that, in essence, generates "virtual" SAGE experiments. Our analysis shows that the chi-square test has the best power and robustness. Because the POWER_SAGE program can easily run "virtual" SAGE studies with different combinations of sample size and tag frequency and determine the power for each combination, it can serve as a useful tool for planning SAGE experiments.
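The idea of a "virtual" SAGE experiment can be illustrated with a minimal sketch. This is not the POWER_SAGE program, whose internals the abstract does not describe. It assumes that the count of a given tag in each library is binomially distributed and that the test is a Pearson chi-square on the 2x2 table of tag counts without continuity correction. The function names (`sample_binomial`, `chi_square_pvalue`, `estimate_power`) and all parameter values are hypothetical, chosen only for illustration.

```python
import math
import random


def sample_binomial(n, p, rng):
    """Draw from Binomial(n, p) by accumulating geometric gaps between successes."""
    if p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    log_q = math.log(1.0 - p)
    count, position = 0, 0
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        position += int(math.log(u) / log_q) + 1  # trials until next success
        if position > n:
            return count
        count += 1


def chi_square_pvalue(x1, n1, x2, n2):
    """P-value of the Pearson chi-square test (1 df, no continuity
    correction) comparing tag counts x1 of n1 and x2 of n2."""
    a, b = x1, n1 - x1
    c, d = x2, n2 - x2
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence of a difference
    stat = (n1 + n2) * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def estimate_power(n1, p1, n2, p2, alpha=0.05, n_sim=1000, seed=0):
    """Fraction of simulated experiments in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x1 = sample_binomial(n1, p1, rng)
        x2 = sample_binomial(n2, p2, rng)
        if chi_square_pvalue(x1, n1, x2, n2) < alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # Hypothetical design: two libraries of 50,000 tags each, with the tag
    # at a frequency of 0.0002 in one sample and 0.0006 in the other.
    print(estimate_power(50_000, 0.0002, 50_000, 0.0006))
```

Calling `estimate_power` over a grid of library sizes and tag frequencies gives the kind of power table that could inform how many tags to sequence. Setting `p1` equal to `p2` estimates the false-positive rate of the test instead.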
The POWER_SAGE software is available upon request from the authors.