Pan Wei
Division of Biostatistics, School of Public Health, University of Minnesota, A460 Mayo Building (MMC 303), Minneapolis, MN 55455-0378, USA.
Bioinformatics. 2003 Jul 22;19(11):1333-40. doi: 10.1093/bioinformatics/btg167.
Recently, a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarray (SAM) method and the mixture model method (MMM), has been proposed to detect differential gene expression in replicated microarray experiments conducted under two conditions. All of these methods depend on constructing a test statistic Z and a so-called null statistic z. The null statistic z provides a reference distribution for Z so that statistical inference can be carried out. A common way of constructing z is to apply Z to randomly permuted data. Here we point out that the distribution of z may not approximate the null distribution of Z well, possibly leading to overly conservative inference. This observation may also apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly.
Using simulated and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM and MMM. Some interesting findings on the operating characteristics of EB, SAM and MMM are also reported. Finally, by combining the ideas of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.
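The common construction mentioned in the abstract, applying Z to randomly permuted data to obtain the null statistic z, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-sample t-type form of Z, the function names `t_statistic` and `permutation_null`, the simulated data, and the number of permutations. It shows only the existing permutation approach, not the new method proposed by the author.

```python
import numpy as np


def t_statistic(x, y):
    """Per-gene two-sample t-type statistic (assumed form of Z).

    x, y: arrays of shape (genes, replicates), one per condition.
    """
    nx, ny = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    se = np.sqrt(x.var(axis=1, ddof=1) / nx + y.var(axis=1, ddof=1) / ny)
    return diff / se


def permutation_null(x, y, n_perm=100, seed=0):
    """Null statistics z: the same statistic applied to permuted labels.

    Returns an array of shape (n_perm, genes).
    """
    rng = np.random.default_rng(seed)
    data = np.concatenate([x, y], axis=1)
    nx = x.shape[1]
    z = np.empty((n_perm, data.shape[0]))
    for b in range(n_perm):
        # Shuffle the replicate columns, i.e. reassign condition labels.
        cols = rng.permutation(data.shape[1])
        shuffled = data[:, cols]
        z[b] = t_statistic(shuffled[:, :nx], shuffled[:, nx:])
    return z


# Hypothetical toy data: 200 genes, 4 replicates per condition,
# with the first 20 genes shifted in the second condition.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 4))
y = rng.normal(size=(200, 4))
y[:20] += 2.0

Z = t_statistic(x, y)                   # observed test statistics
z = permutation_null(x, y, n_perm=50)   # permutation-based null statistics
```

Pooling `z` over genes and permutations gives the reference distribution against which `Z` is compared. The abstract's caution is that this pooled distribution may not approximate the null distribution of Z well.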