van de Wiel Mark A, Kim Kyung In
Department of Mathematics, Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands.
Biometrics. 2007 Sep;63(3):806-15. doi: 10.1111/j.1541-0420.2006.00736.x.
Given a set of microarray data, the problem is to detect differentially expressed genes using a false discovery rate (FDR) criterion. In contrast to common procedures in the literature, we base the selection criterion not only on statistical significance but also on the effect size. We therefore select only those genes whose differential expression is significantly greater than some f-fold change (e.g., f = 2). This corresponds to the use of an interval null domain for the effect size. Based on a simple error model, we discuss a naive estimator for the FDR, interpreted as the probability that the parameter of interest lies in the null domain (e.g., μ < log₂(2) = 1) given that the test statistic exceeds a threshold. We improve the naive estimator by using deconvolution; that is, the density of the parameter of interest is recovered from the data. We study the performance of the methods using simulations and real data.
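The FDR interpretation above can be illustrated with a small simulation. The sketch below is not the estimator from the paper. It assumes an additive Gaussian error model (observed log-ratio = true effect μ plus noise) and a made-up two-component effect-size distribution, then computes the fraction of selected genes whose true μ lies in the null domain μ < log₂(f). The function name, the mixture proportions, and all default parameter values are hypothetical choices for illustration only.

```python
import numpy as np


def simulate_interval_null_fdr(n_genes=20000, fold=2.0, sigma=0.5,
                               threshold=1.5, seed=0):
    """Monte Carlo value of P(mu in null domain | statistic > threshold).

    Assumptions (not taken from the paper): additive Gaussian noise and a
    two-component normal mixture for the true effect sizes mu.
    """
    rng = np.random.default_rng(seed)
    bound = np.log2(fold)  # null domain is mu < log2(f); equals 1 for f = 2

    # Hypothetical effect sizes: 90% small effects, 10% large effects.
    is_large = rng.random(n_genes) < 0.1
    mu = np.where(is_large,
                  rng.normal(0.0, 2.0, n_genes),
                  rng.normal(0.0, 0.3, n_genes))

    # Simple error model: observed statistic = true effect + noise.
    x = mu + rng.normal(0.0, sigma, n_genes)

    selected = x > threshold   # genes declared differentially expressed
    in_null = mu < bound       # genes whose true effect is below f-fold

    n_selected = int(selected.sum())
    fdr = float((selected & in_null).sum() / n_selected) if n_selected else 0.0
    return fdr, n_selected


if __name__ == "__main__":
    for c in (1.0, 1.5, 2.0, 3.0):
        fdr, n_sel = simulate_interval_null_fdr(threshold=c)
        print(f"threshold={c:.1f}  selected={n_sel}  FDR={fdr:.3f}")
```

In this toy setting the true μ is known, so the FDR can be computed directly. With real data μ is unobserved, which is why its density has to be estimated, for example by deconvolution as the abstract describes.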