Minnesota Supercomputing Institute for Advanced Computational Research, University of Minnesota, Minneapolis, MN 55455, USA.
BMC Bioinformatics. 2010 Sep 16;11:465. doi: 10.1186/1471-2105-11-465.
In microarray gene expression profiling experiments, differentially expressed genes (DEGs) are detected among the tens of thousands of genes on an array using statistical tests. It is important to control the number of false positives (errors) in the resulting DEG list. To date, more than 20 different multiple test methods have been reported that compute overall Type I error rates in microarray experiments. However, these methods share a common limitation: they have low power when only a small number of DEGs exist among a large number of total genes on the array.
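The power problem described above can be illustrated with a small sketch. It uses the standard Benjamini-Hochberg step-up procedure as a stand-in for a conventional simultaneous multiple test; it is not the EDR method proposed in this study. The p-values are synthetic and the function names (`benjamini_hochberg`, `detected_degs`) are hypothetical, chosen only for this illustration. The same ten DEG p-values are tested against a growing background of unchanged genes.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the set of indices rejected by the BH step-up procedure."""
    m = len(p_values)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    return set(order[:cutoff_rank])


def detected_degs(total_genes, n_deg=10, alpha=0.05):
    """Count how many of the synthetic DEGs survive BH for a given array size."""
    # Assumption: fixed, illustrative DEG p-values from 0.0001 to 0.001.
    deg_p = [1e-4 * (j + 1) for j in range(n_deg)]
    # Assumption: unchanged genes have p-values spread evenly over (0, 1).
    n_null = total_genes - n_deg
    null_p = [(i + 0.5) / n_null for i in range(n_null)]
    rejected = benjamini_hochberg(deg_p + null_p, alpha)
    # The DEGs occupy the first n_deg positions of the combined list.
    return sum(1 for i in rejected if i < n_deg)


for total in (100, 10_000):
    print(total, detected_degs(total))
```

Under these assumptions, all ten DEGs are detected when the array holds 100 genes, and none are detected when it holds 10,000, even though the DEG p-values themselves are unchanged.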
This study contrasts the parallel multiplicity of objectively related tests with the traditional simultaneousness of subjectively related tests, and proposes a new assessment called the Error Discovery Rate (EDR) for evaluating multiple test comparisons in microarray experiments. Parallel multiple tests use only the negative genes that parallel the positive genes to control the error rate, whereas simultaneous multiple tests use the total number of unchanged genes for error estimates. Here, we demonstrate that the EDR method exhibits improved specificity and sensitivity over other methods on expression data sets with sequence digital expression confirmation, on simulation data, and on three experimental data sets that vary in the proportion of DEGs. The EDR method overcomes a common problem of previous multiple test procedures, namely that detection power under Type I error rate control is low when the total number of genes used is large but the number of DEGs is small.
Microarrays are extensively used to address many research questions. However, the sensitivity and specificity of microarray data analysis can still be improved by developing better multiple test comparisons. This study proposes a new view of multiplicity in microarray experiments, and the EDR provides an alternative multiple test method for Type I error control in microarray experiments.