Biostatistics Epidemiology Research Design Core, Center for Clinical and Translational Sciences, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
BMC Genomics. 2011 Dec 23;12 Suppl 5(Suppl 5):S7. doi: 10.1186/1471-2164-12-S5-S7.
In microarray experiments with small sample sizes, it is challenging to estimate p-values accurately and to choose appropriate cutoff p-values for gene selection. Although permutation-based methods have been shown to have greater sensitivity and specificity than the regular t-test, their p-values are highly discrete because very small samples allow only a limited number of permutations. Furthermore, estimated permutation-based p-values for true nulls are highly correlated and not uniformly distributed between zero and one, which makes it difficult to apply current false discovery rate (FDR)-controlling methods.
We propose a model-based information sharing method (MBIS) that, after an appropriate data transformation, utilizes information shared among genes. We use a normal distribution to model the mean differences of true nulls across two experimental conditions. The parameters of the model are then estimated using all available data. Based on this model, we calculate p-values that are uniformly distributed for true nulls. Then, since FDR-controlling methods are generally not well suited to microarray data with very small sample sizes, we select genes at a given cutoff p-value and then estimate the false discovery rate.
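The workflow described above (fit a normal null to the per-gene mean differences, compute p-values from it, select genes at a cutoff, then estimate the FDR) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on several assumptions that the abstract does not state: the input is taken to be already transformed; the null parameters are estimated robustly with the median and scaled median absolute deviation (MAD); and the FDR is estimated conservatively as cutoff × (number of genes) / (number selected), which treats every gene as a true null. The function names `mbis_like_pvalues` and `select_and_estimate_fdr` are hypothetical.

```python
import math
import random
import statistics


def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def mbis_like_pvalues(mean_diffs):
    """Fit a normal null to per-gene mean differences and return p-values.

    Assumption (not from the source): the null centre and spread are
    estimated from all genes with the median and the scaled MAD, so a
    minority of truly changed genes has little influence on the fit.
    """
    center = statistics.median(mean_diffs)
    mad = statistics.median([abs(d - center) for d in mean_diffs])
    scale = 1.4826 * mad  # converts MAD to a standard deviation for normal data
    if scale == 0:
        raise ValueError("zero spread: cannot fit a normal null")
    return [normal_two_sided_p((d - center) / scale) for d in mean_diffs]


def select_and_estimate_fdr(pvalues, cutoff):
    """Select genes with p <= cutoff and estimate the FDR of that selection.

    Assumption: the expected number of false positives is cutoff * m,
    i.e. all m genes are treated as true nulls (a conservative choice).
    """
    selected = [i for i, p in enumerate(pvalues) if p <= cutoff]
    if not selected:
        return selected, 0.0
    expected_false = cutoff * len(pvalues)
    return selected, min(1.0, expected_false / len(selected))


# Simulated example: 1900 null genes and 100 shifted genes (illustrative values).
rng = random.Random(0)
diffs = [rng.gauss(0.0, 0.3) for _ in range(1900)]
diffs += [rng.gauss(2.0, 0.3) for _ in range(100)]
pvals = mbis_like_pvalues(diffs)
genes, fdr = select_and_estimate_fdr(pvals, cutoff=0.001)
print(len(genes), "genes selected; estimated FDR =", round(fdr, 4))
```

Because the null is fitted from all genes at once, the resulting p-values are continuous rather than limited to the few values a permutation test can produce with very small samples.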
Simulation studies and analysis of real microarray data show that the proposed method, MBIS, is more powerful and reliable than current methods. It is applicable to a wide variety of situations.