Department of Medical Protein Research, VIB, Ghent, Belgium.
J Proteome Res. 2011 Dec 2;10(12):5555-61. doi: 10.1021/pr200913a. Epub 2011 Oct 26.
Proteome identification using peptide-centric proteomics is a routine analysis technique. One of the most powerful and popular methods for identifying peptides from MS/MS spectra is protein database matching using search engines. Significance thresholding through false discovery rate (FDR) estimation by target/decoy searches is used to ensure that predominantly confident assignments of MS/MS spectra to peptides are retained. However, shortcomings have become apparent when such decoy searches are used to estimate the FDR. To study these shortcomings, we here introduce a novel kind of decoy database that contains isobaric mutated versions of the peptides that were identified in the original search. Because the entrapment sequences are generated in a supervised way, we call this a directed decoy database. Because the peptides in our directed decoy database are specifically designed to closely resemble the forward identifications, they allow the limitations of existing search algorithms in making correct calls in such strongly confusing situations to be analyzed. Interestingly, for the vast majority of confident peptide identifications, a directed decoy peptide-to-spectrum match can be found whose match score is equal to or better than that of the forward match, highlighting an important issue in the interpretation of peptide identifications in present-day high-throughput proteomics.
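The two ideas in the abstract, target/decoy FDR estimation and isobaric decoy peptides, can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the FDR is approximated by the simple ratio of decoy hits to target hits at or above a score threshold, and the isobaric variants are produced by swapping adjacent residues, which preserves the residue composition and therefore the peptide mass. The abstract does not state which mutation scheme the authors used, and the function names `estimate_fdr` and `adjacent_swap_variants` are hypothetical.

```python
from typing import Iterable, List


def estimate_fdr(target_scores: Iterable[float],
                 decoy_scores: Iterable[float],
                 threshold: float) -> float:
    """Approximate the FDR at a score threshold as decoy hits / target hits.

    Assumption: the simple ratio estimator; other estimators exist.
    """
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # no target hits pass the threshold, so nothing to estimate
    return n_decoy / n_target


def adjacent_swap_variants(peptide: str) -> List[str]:
    """Return distinct sequences made by swapping one pair of adjacent residues.

    Each variant has the same residue composition as the input and is
    therefore isobaric with it. This is an illustrative scheme only, not
    necessarily the one used to build the directed decoy database.
    """
    variants = set()
    for i in range(len(peptide) - 1):
        if peptide[i] != peptide[i + 1]:  # swapping identical residues changes nothing
            swapped = peptide[:i] + peptide[i + 1] + peptide[i] + peptide[i + 2:]
            variants.add(swapped)
    return sorted(variants)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    print(estimate_fdr([10, 20, 30, 40], [5, 35], threshold=25))
    print(adjacent_swap_variants("PEPTIDEK"))
```

Variants of this kind share the precursor mass and most fragment ions with the original peptide, which is why a search engine can struggle to tell them apart from the forward identification.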