Department of Chemistry, Stanford University, 364 Lomita Dr., Stanford, California 94305, United States.
Department of Neurology and Neurological Sciences, Stanford University School of Medicine, 291 Campus Dr., Stanford, California 94305, United States.
Anal Chem. 2023 Feb 7;95(5):2732-2740. doi: 10.1021/acs.analchem.2c03719. Epub 2023 Jan 24.
The multiple hypothesis testing problem is inherent in large-scale quantitative "omic" experiments such as mass spectrometry-based proteomics. Yet, tools for comparing the costs and benefits of different p-value correction methods under different experimental conditions are lacking. We performed thousands of simulations of omic experiments under a range of experimental conditions and applied correction using the Benjamini-Hochberg (BH), Bonferroni, and permutation-based false discovery proportion (FDP) estimation methods. The tremendous false discovery rate (FDR) benefit of correction was confirmed in a range of different contexts. No correction method can guarantee a low FDP in a single experiment, but the probability of a high FDP is small when a high number and proportion of corrected p-values are significant. On average, correction decreased sensitivity, but the sensitivity costs of BH and permutation were generally modest compared to the FDR benefits. In a given experiment, observed sensitivity was always maintained or decreased by BH and Bonferroni, whereas it was often increased by permutation. Overall, permutation had better FDR and sensitivity than BH. We show how increasing sample size, decreasing variability, or increasing effect size can enable the detection of all true changes while still correcting p-values, and we present basic guidelines for omic experimental design. Analysis of an experimental proteomic data set with defined changes corroborated these trends. We developed an R Shiny web application for further exploration and visualization of these models, which we call the Simulator of p-value Multiple Hypothesis Correction (SIMPLYCORRECT), and a high-performance R package, permFDP, for easy use of the permutation-based FDP estimation method.
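As a rough illustration of the sensitivity difference between two of the correction methods named in the abstract, the following minimal sketch computes Bonferroni- and BH-adjusted p-values for a small, made-up list of p-values. It is not the authors' code and does not use SIMPLYCORRECT or permFDP; the function names, the example p-values, and the 0.05 threshold are assumptions chosen for illustration, and the permutation-based FDP estimate is not covered.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjustment: p * m / rank, made monotone from the largest p down."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) to the smallest (rank 1),
    # keeping the running minimum so adjusted values never increase with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant(adjusted, alpha=0.05):
    """Number of adjusted p-values at or below the threshold alpha."""
    return sum(1 for q in adjusted if q <= alpha)


# Hypothetical p-values from eight tests (illustrative only, not from the paper).
pvals = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]

print("uncorrected:", count_significant(pvals))                      # 5
print("Bonferroni: ", count_significant(bonferroni(pvals)))          # 1
print("BH:         ", count_significant(benjamini_hochberg(pvals)))  # 3
```

In this toy input, BH retains more significant results than Bonferroni, and neither retains more than the uncorrected analysis, which is consistent with the abstract's statement that BH and Bonferroni maintain or decrease observed sensitivity.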