Li Dongmei, Xie Zidian, Zand Martin, Fogg Thomas, Dye Timothy
Clinical and Translational Science Institute, School of Medicine and Dentistry, University of Rochester, 265 Crittenden Boulevard CU 420708, Rochester, 14642, NY, USA.
Goergen Institute for Data Science, University of Rochester, Computer Studies Building, Rochester, 14642, NY, USA.
BMC Bioinformatics. 2017 Jan 3;18(1):1. doi: 10.1186/s12859-016-1414-x.
The stability of a multiple testing procedure, defined as the standard deviation of the total number of discoveries, can be used as an indicator of the procedure's variability. Improving stability can help to increase the consistency of findings from replicated experiments. The Benjamini-Hochberg procedure and Storey's q-value procedure are two commonly used multiple testing procedures for controlling false discoveries in genomic studies. Storey's q-value procedure has higher power and lower stability than the Benjamini-Hochberg procedure. To improve the stability of Storey's q-value procedure while maintaining its high power in genomic data analysis, we propose a new multiple testing procedure, named Bon-EV, that controls the false discovery rate (FDR) based on Bonferroni's approach.
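The stability measure described above can be illustrated with a minimal sketch. It is not the authors' code and does not implement Bon-EV, whose formula is not given in this abstract. It applies the standard Benjamini-Hochberg adjustment to simulated two-sided z-test p-values and reports the standard deviation of the number of discoveries across replicates. The function names, the z-test setup, and all default values (200 tests, 40 true signals, effect size 3.0, 100 replicates) are assumptions chosen for illustration.

```python
import math
import random
import statistics


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_discoveries(n_tests=200, n_signal=40, effect=3.0,
                         alpha=0.05, rng=None):
    """Number of BH discoveries in one simulated experiment (illustrative)."""
    rng = rng or random.Random()
    pvals = []
    for j in range(n_tests):
        # The first n_signal tests carry a true effect; the rest are null.
        shift = effect if j < n_signal else 0.0
        pvals.append(two_sided_p(rng.gauss(shift, 1.0)))
    return sum(q <= alpha for q in bh_adjust(pvals))


def stability(n_reps=100, seed=1, **kwargs):
    """Standard deviation of the discovery count across replicates."""
    rng = random.Random(seed)
    counts = [simulate_discoveries(rng=rng, **kwargs) for _ in range(n_reps)]
    return statistics.stdev(counts)


if __name__ == "__main__":
    print("SD of total discoveries:", stability())
```

Under this definition a smaller standard deviation means a more stable procedure, so the same harness could be used to compare procedures by swapping out `bh_adjust`.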
Simulation studies show that the proposed Bon-EV procedure maintains the high power of Storey's q-value procedure while achieving better FDR control and higher stability for large samples (30 in each group) and medium samples (15 in each group), whether the test statistics are independent, somewhat correlated, or highly correlated. When the sample size is small (5 in each group), the performance of the Bon-EV procedure falls between that of the Benjamini-Hochberg procedure and that of Storey's q-value procedure. Examples using RNA-Seq data show that the Bon-EV procedure has higher stability than Storey's q-value procedure while maintaining equivalent power, and higher power than the Benjamini-Hochberg procedure.
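The power relationship between the two reference procedures can be seen in a short sketch. It is a simplified illustration, not the authors' code and not Bon-EV. Storey's q-values are the Benjamini-Hochberg adjusted p-values scaled by an estimate of the proportion of true nulls, pi0, which is at most 1, so Storey's procedure rejects at least as many hypotheses at the same threshold. The fixed tuning parameter `lam=0.5` is an assumption made for simplicity, and the example p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def storey_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true nulls from p-values above lam.

    A fixed lam is an illustrative simplification.
    """
    m = len(pvalues)
    return min(1.0, sum(p > lam for p in pvalues) / (m * (1.0 - lam)))


def storey_qvalues(pvalues, lam=0.5):
    """Storey q-values: BH adjusted p-values scaled by the pi0 estimate."""
    pi0 = storey_pi0(pvalues, lam)
    return [pi0 * q for q in bh_adjust(pvalues)]


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    pvals = [0.001, 0.002, 0.02, 0.03, 0.6, 0.9]
    alpha = 0.05
    bh_hits = sum(q <= alpha for q in bh_adjust(pvals))
    storey_hits = sum(q <= alpha for q in storey_qvalues(pvals))
    print("BH discoveries:", bh_hits)
    print("Storey discoveries:", storey_hits)
```

Because the pi0 estimate is itself computed from the data, it varies from one replicate to the next, which is one plausible reason the discovery count is less stable under Storey's procedure.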
For medium or large sample sizes, the Bon-EV procedure has improved FDR control and stability compared with Storey's q-value procedure and improved power compared with the Benjamini-Hochberg procedure. The Bon-EV multiple testing procedure is available as the BonEV package in R for download at https://CRAN.R-project.org/package=BonEV.