Balgley Brian M, Laudeman Tom, Yang Li, Song Tao, Lee Cheng S
Calibrant Biosystems, Gaithersburg, MD 20878, USA.
Mol Cell Proteomics. 2007 Sep;6(9):1599-608. doi: 10.1074/mcp.M600469-MCP200. Epub 2007 May 28.
Peptide identification of tandem mass spectra by a variety of available search algorithms forms the foundation for much of modern-day mass spectrometry-based proteomics. Despite the critical importance of proper evaluation and interpretation of the results generated by these algorithms, there is still little consistency in how they are applied or in the understanding of their similarities and differences. A survey was conducted of four tandem mass spectrometry peptide identification search algorithms: Mascot, Open Mass Spectrometry Search Algorithm (OMSSA), Sequest, and X! Tandem. The same input data, search parameters, and sequence library were used for all searches. Comparisons were based on commonly used scoring methodologies for each algorithm and on the results of a target-decoy approach to sequence library searching. The results indicated that there is little difference in the output of the algorithms so long as consistent scoring procedures are applied. They also showed that some commonly used scoring procedures may lead to excessive false discovery rates. Finally, an alternative method for determining an optimal cutoff threshold is proposed.
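The target-decoy approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure, and it is not the alternative cutoff method the paper proposes, which the abstract does not describe. It assumes that each peptide-spectrum match (PSM) carries a single score where higher is better (for E-value based output such as OMSSA or X! Tandem, something like the negative log of the E-value would be needed), and that the false discovery rate above a cutoff is estimated as the number of decoy hits divided by the number of target hits. The function names `estimate_fdr` and `lowest_cutoff_for_fdr` are hypothetical.

```python
from typing import List, Optional, Tuple

# A PSM is modeled as (score, is_decoy); higher score = better match (assumption).
PSM = Tuple[float, bool]


def estimate_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR at a score cutoff as decoy hits / target hits (one common convention)."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        # No target hits pass: FDR is 0 if nothing passes at all, otherwise unbounded.
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def lowest_cutoff_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr, or None."""
    passing = [
        score
        for score in {score for score, _ in psms}
        if estimate_fdr(psms, score) <= max_fdr
    ]
    return min(passing) if passing else None


if __name__ == "__main__":
    # Made-up scores, purely illustrative.
    example = [
        (9.0, False), (8.0, False), (7.0, False), (6.0, True),
        (5.0, False), (4.0, True), (3.0, False),
    ]
    for limit in (0.01, 0.25):
        print(limit, lowest_cutoff_for_fdr(example, limit))
```

Applying the same cutoff-selection routine to the output of every search engine is one way to obtain the consistent scoring that the abstract identifies as the condition under which the algorithms give similar results. The quadratic scan over candidate cutoffs is kept for readability; a sorted single pass would be preferable for real data sets.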