Hunt Daniel L, Cheng Cheng, Pounds Stanley
Department of Biostatistics, St. Jude Children's Research Hospital, 332 N. Lauderdale St., Memphis, TN 38105-2794 USA.
Comput Stat Data Anal. 2009 Mar 15;53(5):1688-1700. doi: 10.1016/j.csda.2008.01.013.
In differential expression analysis of microarray data, it is common to assume independence among null hypotheses (and thus among gene expression levels). The independence assumption implies that the number of false rejections V follows a binomial distribution and leads to an estimator of the empirical false discovery rate (eFDR). Here, V is instead modeled with the beta-binomial distribution, and an estimator of the beta-binomial false discovery rate (bbFDR) is derived. This approach accounts for how correlation among non-differentially expressed genes influences the distribution of V. Permutations are used to generate observed values of V under the null hypotheses, and a beta-binomial distribution is fit to these values. In simulation studies of correlated non-differentially expressed genes, the bbFDR estimator is compared to the eFDR estimator and is found to outperform it in certain scenarios. As an example, the method is also applied to an analysis comparing gene expression in soft tissue sarcoma samples with that in normal tissue samples.
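The step of fitting a beta-binomial distribution to null values of V can be illustrated with a minimal sketch. It is not the authors' bbFDR estimator, and the abstract does not say which fitting procedure was used; the method-of-moments fit, the function name `fit_beta_binomial_moments`, and the simulated counts standing in for permutation results are all assumptions made for illustration. The sketch relies on the standard beta-binomial variance, Var(V) = n·pi·(1 − pi)·(1 + (n − 1)·rho), where rho = 1/(alpha + beta + 1) measures the correlation among tests and rho = 0 recovers the binomial case.

```python
import numpy as np


def fit_beta_binomial_moments(v, n):
    """Method-of-moments fit of a beta-binomial(n, alpha, beta) to counts v.

    Returns (alpha_hat, beta_hat, rho_hat). This is an illustrative choice of
    fitting procedure, not necessarily the one used in the paper.
    """
    v = np.asarray(v, dtype=float)
    pi_hat = v.mean() / n                    # mean proportion of false rejections
    var_hat = v.var(ddof=1)                  # sample variance of V
    binom_var = n * pi_hat * (1.0 - pi_hat)  # variance expected under independence
    # Solve Var(V) = n*pi*(1-pi) * (1 + (n-1)*rho) for rho.
    rho_hat = (var_hat / binom_var - 1.0) / (n - 1)
    if rho_hat <= 0:
        raise ValueError("no overdispersion detected; a binomial model suffices")
    scale = 1.0 / rho_hat - 1.0              # equals alpha + beta
    return pi_hat * scale, (1.0 - pi_hat) * scale, rho_hat


# Hypothetical stand-in for permutation output: 5000 null values of V among
# 1000 tests, drawn from a beta-binomial with alpha=2, beta=38 (assumed values).
rng = np.random.default_rng(0)
n_null = 1000
p = rng.beta(2.0, 38.0, size=5000)
v_perm = rng.binomial(n_null, p)

alpha_hat, beta_hat, rho_hat = fit_beta_binomial_moments(v_perm, n_null)
print(f"alpha={alpha_hat:.2f}, beta={beta_hat:.2f}, rho={rho_hat:.4f}")
```

In practice `v_perm` would hold the number of rejections observed in each permutation of the sample labels. A maximum-likelihood fit could replace the moment estimates where greater efficiency matters.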