Center for Biomembrane Research, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden.
J Proteome Res. 2011 May 6;10(5):2671-8. doi: 10.1021/pr1012619. Epub 2011 Mar 24.
In shotgun proteomics, the quality of a hypothesized match between an observed spectrum and a peptide sequence is quantified by a score function. Because the score function lies at the heart of any peptide identification pipeline, this function greatly affects the final results of a proteomics assay. Consequently, valid statistical methods for assessing the quality of a given score function are extremely important. Previously, several research groups have used samples of known protein composition to assess the quality of a given score function. We demonstrate that this approach is problematic, because the outcome can depend on factors other than the score function itself. We then propose an alternative use of the same type of data to validate a score function. The central idea of our approach is that database matches that are not explained by any protein in the purified sample comprise a robust representation of incorrect matches. We apply our alternative assessment scheme to several commonly used score functions, and we show that our approach generates a reproducible measure of the calibration of a given peptide identification method. Furthermore, we show how our quality test can be useful in the development of novel score functions.
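The central idea above can be illustrated with a small sketch. The abstract does not name the statistic used, so what follows is an assumption for illustration rather than the authors' procedure: if a score function reports calibrated p-values, the p-values of matches to peptides that no protein in the purified sample can explain should be approximately uniform on [0, 1], and their Kolmogorov-Smirnov distance from the uniform distribution gives one possible summary of miscalibration. The function names, the `(peptide, p_value)` input format, and the peptide set are all hypothetical. The sketch also assumes the search database contains sequences beyond those of the sample proteins.

```python
from typing import Iterable, List, Set, Tuple


def unexplained_p_values(
    psms: Iterable[Tuple[str, float]], sample_peptides: Set[str]
) -> List[float]:
    """Return sorted p-values of matches whose peptide is NOT explained
    by any protein in the purified sample (hypothetical input format:
    one (peptide, p_value) pair per peptide-spectrum match)."""
    return sorted(p for peptide, p in psms if peptide not in sample_peptides)


def ks_distance_from_uniform(p_values: Iterable[float]) -> float:
    """Largest gap between the empirical CDF of the p-values and the
    Uniform(0, 1) CDF. Values near 0 suggest well-calibrated p-values.
    This statistic is an illustrative choice, not taken from the paper."""
    ps = sorted(p_values)
    n = len(ps)
    if n == 0:
        raise ValueError("need at least one p-value")
    distance = 0.0
    for i, p in enumerate(ps):
        # Compare the uniform CDF (p) with the empirical CDF just after
        # and just before the i-th sorted observation.
        distance = max(distance, (i + 1) / n - p, p - i / n)
    return distance


# Toy data: two matches are explained by the sample, three are not.
psms = [
    ("PEPTIDEA", 0.001),
    ("OTHERX", 0.25),
    ("OTHERY", 0.75),
    ("PEPTIDEB", 0.01),
    ("OTHERZ", 0.5),
]
sample_peptides = {"PEPTIDEA", "PEPTIDEB"}

null_p_values = unexplained_p_values(psms, sample_peptides)
calibration_gap = ks_distance_from_uniform(null_p_values)
```

In this toy input only the three unexplained matches enter the calibration check; the explained matches are set aside because they may be either correct or incorrect.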