Wen Bo, Freestone Jack, Riffle Michael, MacCoss Michael J, Noble William S, Keich Uri
Department of Genome Sciences, University of Washington, Seattle, WA, USA.
School of Mathematics and Statistics, University of Sydney, Sydney, New South Wales, Australia.
Nat Methods. 2025 Jun 16. doi: 10.1038/s41592-025-02719-x.
A critical challenge in mass spectrometry proteomics is accurately assessing error control, especially given that software tools employ distinct methods for reporting errors. Many tools are closed-source and poorly documented, leading to inconsistent validation strategies. Here we identify three prevalent methods for validating false discovery rate (FDR) control: one invalid, one providing only a lower bound, and one valid but underpowered. As a result, the proteomics community has limited insight into actual FDR control effectiveness, especially for data-independent acquisition (DIA) analyses. We propose a theoretical framework for entrapment experiments, allowing us to rigorously characterize different approaches. Moreover, we introduce a more powerful evaluation method and apply it alongside existing techniques to assess existing tools. We first validate our analysis in the better-understood data-dependent acquisition setup and then analyze DIA data, where we find that no DIA search tool consistently controls the FDR, with particularly poor performance on single-cell datasets.
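To make the idea of an entrapment-based check concrete, the following is a minimal sketch of two estimators of the false discovery proportion (FDP) that are commonly used in entrapment experiments. The abstract does not state any formulas, so everything here is an assumption for illustration: the function name, the parameter names, and the two formulas themselves (a plain entrapment fraction, usually read as a lower bound, and a "combined" estimate that scales entrapment hits by 1 + 1/r, where r is the ratio of entrapment to original target database size). It is not the authors' implementation, and it does not cover the more powerful method the abstract introduces.

```python
def entrapment_fdp_estimates(n_target: int, n_entrapment: int, r: float) -> dict:
    """Illustrative entrapment-based FDP estimates (assumed formulas).

    n_target     -- discoveries from the original target database
    n_entrapment -- discoveries from the entrapment database
                    (known to be false by construction)
    r            -- entrapment-to-target database size ratio (assumed > 0)
    """
    if n_target < 0 or n_entrapment < 0:
        raise ValueError("counts must be non-negative")
    if r <= 0:
        raise ValueError("r must be positive")

    total = n_target + n_entrapment
    if total == 0:
        # No discoveries at all: report zero for both estimates.
        return {"lower_bound": 0.0, "combined": 0.0}

    # Counts only the entrapment hits, ignoring false discoveries among
    # the original targets, so it can only underestimate the FDP.
    lower_bound = n_entrapment / total

    # Assumes each entrapment hit stands in for roughly 1/r false
    # discoveries hidden among the original targets.
    combined = n_entrapment * (1.0 + 1.0 / r) / total

    return {"lower_bound": lower_bound, "combined": combined}


if __name__ == "__main__":
    # Hypothetical counts, not data from the paper.
    est = entrapment_fdp_estimates(n_target=980, n_entrapment=20, r=1.0)
    print(est)
```

Under these assumptions, a tool reporting discoveries at a 1% FDR threshold would look suspect if even the lower-bound estimate sat well above 1%, whereas a combined estimate at or below the threshold would be consistent with valid control.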