Solution to Statistical Challenges in Proteomics Is More Statistics, Not Less.

Author information

Serang Oliver, Käll Lukas

Affiliations

Department of Informatik, Freie Universität Berlin, Takustr. 9, Berlin 14195, Germany.

Leibniz-Institute for Freshwater Ecology and Inland Fisheries (IGB), Müggelseedamm 310, Berlin 12587, Germany.

Publication information

J Proteome Res. 2015 Oct 2;14(10):4099-103. doi: 10.1021/acs.jproteome.5b00568. Epub 2015 Aug 28.

Abstract

In any high-throughput scientific study, it is often essential to estimate the percentage of findings that are actually incorrect. This percentage is called the false discovery rate (FDR), and it is an invariant (albeit often unknown) quantity for any well-formed study. In proteomics, it has become common practice to incorrectly conflate the protein FDR (the percentage of identified proteins that are actually absent) with protein-level target-decoy, a particular method for estimating the protein-level FDR. In this manner, the challenges of one approach have been used as the basis for an argument that the field should abstain from protein-level FDR analysis altogether, or even for the suggestion that the very notion of a protein FDR is flawed. As we demonstrate in simple but accurate simulations, not only is the protein-level FDR an invariant concept, but when analyzing large data sets, the failure to properly acknowledge it or to correct for multiple testing can result in large, unrecognized errors, whereby thousands of absent proteins (and potentially every protein in the FASTA database being considered) can be incorrectly identified.
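The multiple-testing effect described in the abstract can be illustrated with a short calculation. This is a minimal sketch and not the authors' simulation: it assumes, purely for illustration, that every spectrum independently produces a false match to any given absent protein with the same small probability. The function name `expected_false_proteins` and all parameter values are hypothetical.

```python
def expected_false_proteins(n_absent, n_spectra, p_false_match):
    """Expected number of absent proteins with at least one false match.

    Illustrative assumption (not from the paper): each spectrum independently
    yields a false match to a given absent protein with probability
    p_false_match, so the chance of at least one false match over n_spectra
    spectra is 1 - (1 - p_false_match) ** n_spectra.
    """
    p_at_least_one = 1.0 - (1.0 - p_false_match) ** n_spectra
    return n_absent * p_at_least_one


if __name__ == "__main__":
    n_absent = 10_000      # hypothetical number of absent proteins in the database
    p_false_match = 1e-6   # hypothetical per-spectrum, per-protein false match rate
    for n_spectra in (10_000, 1_000_000, 10_000_000):
        wrong = expected_false_proteins(n_absent, n_spectra, p_false_match)
        print(f"{n_spectra:>10,} spectra -> ~{wrong:,.0f} absent proteins matched")
```

Under these assumptions the per-match error rate never changes, yet the expected number of wrongly matched proteins grows with the number of spectra and approaches the full set of absent proteins, which is why an uncorrected analysis of a large data set can go wrong without any visible warning.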
