Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States.
J Proteome Res. 2020 May 1;19(5):1975-1981. doi: 10.1021/acs.jproteome.9b00796. Epub 2020 Apr 14.
Statistical significance tests are a common feature in quantitative proteomics workflows. The Student's t-test is widely used to compute the statistical significance of a protein's change between two groups of samples. However, the t-test's null hypothesis asserts that the difference in means between two groups is exactly zero, often marking small but uninteresting fold-changes as statistically significant. Compensations to address this issue are widely used in quantitative proteomics, but we suggest that a replacement of the t-test with a Bayesian approach offers a better path forward. In this article, we describe a Bayesian hypothesis test in which the null hypothesis is an interval rather than a single point at zero; the width of the interval is estimated from population statistics. The improved sensitivity of the method substantially increases the number of truly changing proteins detected in two benchmark data sets (ProteomeXchange identifiers PXD005590 and PXD016470). The method has been implemented within FlashLFQ, an open-source software program that quantifies bottom-up proteomics search results obtained from any search tool. FlashLFQ is rapid, sensitive, and accurate and is available both as an easy-to-use graphical user interface (Windows) and as a command-line tool (Windows/Linux/OSX).
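The contrast between a point null and an interval null can be illustrated with a minimal Python sketch. It is not FlashLFQ's implementation: it assumes a flat prior and a normal approximation to the posterior of the difference in means, and it takes the interval half-width as a fixed, hypothetical parameter (`null_half_width`) rather than estimating it from population statistics as the article describes. The function names and the example log2 intensities are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def _diff_and_se(group_a, group_b):
    """Difference in means (b - a) and its standard error (unequal variances)."""
    diff = mean(group_b) - mean(group_a)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return diff, se


def point_null_p_value(group_a, group_b):
    """Two-sided p-value for the point null 'difference == 0'.

    Assumption: a normal (z) approximation stands in for the t-test here.
    """
    diff, se = _diff_and_se(group_a, group_b)
    z = abs(diff / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))


def interval_null_posterior(group_a, group_b, null_half_width):
    """Posterior probability that the true difference lies in [-w, +w].

    Assumption: flat prior, so the posterior of the difference is
    approximately Normal(observed difference, standard error).
    """
    diff, se = _diff_and_se(group_a, group_b)
    posterior = NormalDist(mu=diff, sigma=se)
    return posterior.cdf(null_half_width) - posterior.cdf(-null_half_width)


if __name__ == "__main__":
    # Hypothetical log2 intensities: a tiny but very consistent shift.
    control = [20.00, 20.02, 19.98, 20.01, 19.99]
    treated = [20.10, 20.12, 20.08, 20.11, 20.09]

    print("point-null p-value:", point_null_p_value(control, treated))
    print("P(difference inside null interval):",
          interval_null_posterior(control, treated, null_half_width=0.5))
```

In this toy case the point-null test reports a very small p-value for a shift of about 0.1 log2 units, while the interval-null calculation assigns nearly all posterior probability to the "no meaningful change" interval, which is the behavior the abstract argues for.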