Estimating p-values in small microarray experiments.
Author information
Yang Hyuna, Churchill Gary
Affiliation
The Jackson Laboratory, Bar Harbor, ME 04609, USA.
Publication information
Bioinformatics. 2007 Jan 1;23(1):38-43. doi: 10.1093/bioinformatics/btl548. Epub 2006 Oct 30.
MOTIVATION
Microarray data typically have small numbers of observations per gene, which can result in low power for statistical tests. Test statistics that borrow information from data across all of the genes can improve power, but these statistics have non-standard distributions, and their significance must be assessed using permutation analysis. When sample sizes are small, the number of distinct permutations can be severely limited, and pooling the permutation-derived test statistics across all genes has been proposed. However, the null distribution of the test statistics under permutation is not the same for equally and differentially expressed genes. This can have a negative impact on both p-value estimation and the power of information borrowing statistics.
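To make the "severely limited" point concrete, the short sketch below counts the distinct ways to relabel a two-group experiment and the smallest two-sided p-value a single gene can reach by permutation alone. It is an illustration under assumptions, not taken from the paper: the function names are hypothetical, a two-group design with a symmetric test statistic is assumed, and the gene count in the usage note is arbitrary.

```python
from math import comb


def distinct_label_permutations(n1: int, n2: int) -> int:
    """Number of distinct ways to split n1 + n2 arrays into groups of size n1 and n2."""
    return comb(n1 + n2, n1)


def min_two_sided_p(n1: int, n2: int) -> float:
    """Smallest two-sided permutation p-value attainable for one gene.

    Assumption: the statistic is symmetric (e.g. |t|). In a balanced design,
    swapping the two group labels reproduces the observed |t|, so at least
    two arrangements always tie with the observed one.
    """
    total = distinct_label_permutations(n1, n2)
    ties = 2 if n1 == n2 else 1
    return ties / total


def pooled_null_size(n_genes: int, n1: int, n2: int) -> int:
    """Number of null statistics available when pooling over n_genes genes."""
    return n_genes * distinct_label_permutations(n1, n2)
```

With three arrays per group there are only 20 distinct relabelings, so no single gene can reach a two-sided p-value below 0.1 on its own; pooling over, say, 1000 genes yields 20000 null statistics, which is what makes pooling attractive in small experiments.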
RESULTS
We investigate permutation-based methods for estimating p-values. One method, which pools from a selected subset of the data, is shown to have the correct type I error rate and to provide accurate estimates of the false discovery rate (FDR). We provide guidelines for selecting an appropriate subset. We also demonstrate that information borrowing statistics have substantially increased power compared to the t-test in small experiments.
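The idea of pooling permutation statistics from a subset of genes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain Welch t-statistic rather than an information borrowing statistic, the names `welch_t`, `pooled_permutation_pvalues`, and `subset_mask` are hypothetical, and the subset is simply passed in as a boolean mask because the abstract does not state the selection guidelines.

```python
from itertools import combinations

import numpy as np


def welch_t(data, idx1, idx2):
    """Per-gene Welch t-statistic; rows are genes, columns are arrays."""
    a, b = data[:, idx1], data[:, idx2]
    diff = a.mean(axis=1) - b.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return diff / se


def pooled_permutation_pvalues(data, n1, subset_mask=None):
    """Two-sided p-values from a null distribution pooled over a gene subset.

    data        : (n_genes, n_samples) array; the first n1 columns are group 1.
    subset_mask : boolean mask of genes contributing to the pooled null
                  (assumption: chosen by the caller; defaults to all genes).
    """
    n_genes, n_samples = data.shape
    if subset_mask is None:
        subset_mask = np.ones(n_genes, dtype=bool)
    all_idx = np.arange(n_samples)

    # Observed statistic for every gene under the true labeling.
    observed = np.abs(welch_t(data, all_idx[:n1], all_idx[n1:]))

    # Enumerate every distinct relabeling; keep null statistics for subset genes only.
    subset_data = data[subset_mask]
    null_stats = []
    for group1 in combinations(range(n_samples), n1):
        idx1 = np.array(group1)
        idx2 = np.setdiff1d(all_idx, idx1)
        null_stats.append(np.abs(welch_t(subset_data, idx1, idx2)))
    pooled = np.sort(np.concatenate(null_stats))

    # Fraction of pooled null statistics at least as extreme as each observed value.
    n_ge = pooled.size - np.searchsorted(pooled, observed, side="left")
    # Floor at one count so no gene is assigned a p-value of exactly zero.
    return np.maximum(n_ge, 1) / pooled.size
```

Passing a mask that excludes genes suspected of differential expression keeps their permutation statistics out of the pooled null, which is the concern the abstract raises about pooling across all genes. How to choose that mask well is the subject of the paper's guidelines and is not reproduced here.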