Department of Mathematics and Statistics, Utah State University, Logan, UT 84322, USA.
BMC Bioinformatics. 2010 May 26;11:281. doi: 10.1186/1471-2105-11-281.
Statistical methods to tentatively identify differentially expressed genes in microarray studies typically assume larger sample sizes than are practical or even possible in some settings.
The performance of several probe-level and probeset-level models was assessed graphically and numerically using three spike-in datasets. A novel nested factorial model based on the Affymetrix GeneChip was developed and found to perform competitively on small-sample spike-in experiments.
Statistical methods whose test statistics are related to the estimated log fold change tend to perform more consistently on small-sample gene expression data. For such small-sample experiments, the nested factorial model can be a useful statistical tool. The method is implemented in freely available R code (affyNFM), provided with a tutorial document at http://www.stat.usu.edu/~jrstevens.
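As a rough illustration of what a test statistic "related to the estimated log fold change" can look like, the sketch below computes a log2 fold change and a Welch-type t statistic for a single gene with three arrays per condition. It is not the nested factorial model and not the affyNFM code; the function names and the intensity values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def log2_fold_change(treatment, control):
    """Estimated log fold change: difference of mean log2 intensities."""
    return (mean(math.log2(x) for x in treatment)
            - mean(math.log2(x) for x in control))


def welch_t(treatment, control):
    """Welch-type t statistic: log fold change divided by its standard error."""
    lt = [math.log2(x) for x in treatment]
    lc = [math.log2(x) for x in control]
    # Sample variances use n - 1 in the denominator, so each group needs >= 2 arrays.
    se = math.sqrt(variance(lt) / len(lt) + variance(lc) / len(lc))
    return (mean(lt) - mean(lc)) / se


# Hypothetical probeset-level intensities for one gene (assumed values, not real data).
control = [100.0, 128.0, 110.0]
treatment = [400.0, 512.0, 440.0]

lfc = log2_fold_change(treatment, control)
t_stat = welch_t(treatment, control)
print(f"log2 fold change = {lfc:.3f}, t = {t_stat:.3f}")
```

With so few arrays the standard error in the denominator is estimated from very little data, which is one reason a statistic tied closely to the log fold change itself may behave more consistently in small-sample settings.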