Xie Yang, Pan Wei, Khodursky Arkady B
Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA.
Bioinformatics. 2005 Dec 1;21(23):4280-8. doi: 10.1093/bioinformatics/bti685. Epub 2005 Sep 27.
False discovery rate (FDR) is defined as the expected percentage of false positives among all the claimed positives. In practice, the true FDR is unknown, so an estimated FDR can serve as a criterion for evaluating the performance of various statistical methods, provided that the estimated FDR approximates the true FDR well or, at least, does not improperly favor or disfavor any particular method. Permutation methods have become popular for estimating FDR in genomic studies. The purpose of this paper is twofold. First, we investigate theoretically and empirically whether the standard permutation-based FDR estimator is biased and, if so, whether the bias inappropriately favors or disfavors any method. Second, we propose a simple modification of the standard permutation method to yield a better FDR estimator, which can in turn serve as a fairer criterion for evaluating various statistical methods.
Both simulated and real data examples are used for illustration and comparison. Three commonly used test statistics are considered: the sample mean, the SAM statistic and Student's t-statistic. The results show that the standard permutation method overestimates FDR. The overestimation is most severe for the sample mean statistic and least severe for the t-statistic, with the SAM statistic lying between the two extremes. This suggests that one has to be cautious when using the standard permutation-based FDR estimates to evaluate various statistical methods. In addition, our proposed FDR estimation method is simple and outperforms the standard method.
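As an illustration of what a standard permutation-based FDR estimate looks like for the three statistics named above, the following is a minimal sketch and not the authors' implementation. It assumes one-sample data (for example, log-ratios, one row per gene), generates the null distribution by random sign flips of whole samples, and calls a gene positive when the absolute statistic reaches a fixed threshold. The function names, the threshold rule, and the SAM fudge factor `s0` (here a user-supplied constant with an arbitrary default) are all assumptions made for this example; the modified estimator proposed in the paper is not reproduced here.

```python
import numpy as np


def sample_mean(x):
    """Per-gene sample mean (rows are genes, columns are samples)."""
    return x.mean(axis=1)


def t_statistic(x):
    """Per-gene one-sample Student's t-statistic."""
    n = x.shape[1]
    return x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n))


def sam_statistic(x, s0=0.1):
    """SAM-style statistic: a t-statistic with a constant s0 added to the
    standard error. s0 is an illustrative constant here (assumption)."""
    n = x.shape[1]
    return x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n) + s0)


def permutation_fdr(data, stat_fn, threshold, n_perm=200, seed=0):
    """Standard permutation estimate of FDR at a given threshold.

    Returns (estimated FDR, number of genes called positive).
    Hypothetical helper written for this sketch.
    """
    rng = np.random.default_rng(seed)
    n_samples = data.shape[1]

    # Genes claimed positive on the observed data.
    observed = np.abs(stat_fn(data))
    n_called = int(np.sum(observed >= threshold))
    if n_called == 0:
        return 0.0, 0

    # Null statistics: flip the sign of each sample (column) at random.
    # The same flip is applied to every gene, which keeps gene-gene
    # correlation intact within each permuted data set.
    false_counts = np.empty(n_perm)
    for b in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=n_samples)
        null_stats = np.abs(stat_fn(data * signs))
        false_counts[b] = np.sum(null_stats >= threshold)

    # Estimated number of false positives divided by the number called,
    # capped at 1.
    est_false_positives = false_counts.mean()
    return min(est_false_positives / n_called, 1.0), n_called


if __name__ == "__main__":
    # Simulated example: 500 genes, 8 samples, first 50 genes shifted.
    rng = np.random.default_rng(1)
    x = rng.normal(size=(500, 8))
    x[:50] += 2.0
    for name, fn, thr in [("mean", sample_mean, 1.0),
                          ("SAM", sam_statistic, 3.0),
                          ("t", t_statistic, 4.0)]:
        fdr, called = permutation_fdr(x, fn, thr)
        print(f"{name}: called={called}, estimated FDR={fdr:.3f}")
```

Note that in this sketch every gene, including the truly shifted ones, contributes to the permutation null distribution; comparing the printed estimates with the known truth in a simulation like this one is one way to examine the kind of bias the abstract describes.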