Higgs Richard E, Knierman Michael D, Freeman Angela Bonner, Gelbert Lawrence M, Patil Sandeep T, Hale John E
Lilly Research Laboratories, MS 1533, Lilly Corporate Center, Indianapolis, Indiana 46285, USA.
J Proteome Res. 2007 May;6(5):1758-67. doi: 10.1021/pr0605320. Epub 2007 Mar 31.
We present a wrapper-based approach to estimate and control the false discovery rate for peptide identifications using the outputs from multiple commercially available MS/MS search engines. The approach combines output from multiple search engines with sequence- and spectrum-derived features in a flexible classification model that produces a score associated with correct peptide identifications. The distribution of this classification model score in a reversed database search is taken as the null distribution for estimating p-values and false discovery rates using a simple and established statistical procedure. Results from 10 analyses of rat sera on an LTQ-FT mass spectrometer indicate that the method is well calibrated for controlling the proportion of false positives in a set of reported peptide identifications, while correctly identifying more peptides than rule-based methods that use a single search engine.
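The step from reversed-database scores to p-values and false discovery rates can be illustrated with a minimal sketch. The abstract does not name the statistical procedure, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names `empirical_p_values` and `benjamini_hochberg` and the toy scores. The classification model that produces the scores is not shown.

```python
import numpy as np


def empirical_p_values(target_scores, decoy_scores):
    """Empirical p-value per target score, using decoy scores as the null.

    Assumption: higher scores indicate a more likely correct identification.
    """
    decoys = np.sort(np.asarray(decoy_scores, dtype=float))
    targets = np.asarray(target_scores, dtype=float)
    # Count decoy (reversed-database) scores that are >= each target score.
    n_ge = len(decoys) - np.searchsorted(decoys, targets, side="left")
    # Add-one correction keeps every p-value strictly positive.
    return (n_ge + 1) / (len(decoys) + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    q_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(q_sorted, 1.0)
    return q


# Toy scores for illustration only (not data from the paper).
target_scores = [0.9, 0.5, 0.1]
decoy_scores = [0.2, 0.3, 0.4, 0.6]

p_vals = empirical_p_values(target_scores, decoy_scores)
q_vals = benjamini_hochberg(p_vals)
accepted = q_vals <= 0.05  # report identifications at an estimated 5% FDR
```

With real data, the decoy set would contain one model score per reversed-database match, and the threshold on `q_vals` would set the tolerated proportion of false positives among the reported identifications.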