Institute for Systems Biology, Seattle, Washington 98109, USA.
Mol Cell Proteomics. 2013 Sep;12(9):2383-93. doi: 10.1074/mcp.R113.027797. Epub 2013 May 29.
A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques.
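As a minimal illustration of the overlap described in the abstract, the sketch below takes hypothetical peptide-spectrum matches (PSMs) from three engines and separates the core set reported by every engine from the identifications reported by only one. The engine names, spectrum IDs, peptide sequences, and the `partition_psms` helper are all invented for illustration. This plain set-based consensus is an assumption made for the example and is not one of the combination methods reviewed in the article. It also does not judge whether an identification is correct.

```python
from collections import defaultdict


def partition_psms(results):
    """Split PSMs into those shared by all engines and those from one engine.

    results: dict mapping engine name -> dict of spectrum ID -> peptide.
    Returns (core, unique), where core is a set of (spectrum ID, peptide)
    pairs reported by every engine, and unique maps each pair reported by
    exactly one engine to that engine's name.
    """
    # Record which engines support each (spectrum, peptide) assignment.
    support = defaultdict(set)
    for engine, psms in results.items():
        for spectrum_id, peptide in psms.items():
            support[(spectrum_id, peptide)].add(engine)

    n_engines = len(results)
    core = {psm for psm, engines in support.items() if len(engines) == n_engines}
    unique = {
        psm: next(iter(engines))
        for psm, engines in support.items()
        if len(engines) == 1
    }
    return core, unique


# Hypothetical high-scoring PSMs; all values are made up for this example.
results = {
    "engine_a": {"s1": "PEPTIDEK", "s2": "ELVISLIVESK", "s3": "AAAK"},
    "engine_b": {"s1": "PEPTIDEK", "s2": "ELVISLIVESK"},
    "engine_c": {"s1": "PEPTIDEK", "s4": "SAMPLER"},
}

core, unique = partition_psms(results)
print("core:", sorted(core))
print("unique:", sorted(unique.items()))
```

In this toy input, the assignment for spectrum `s2` is supported by two of the three engines, so it falls into neither group. Practical combination tools typically weigh such partial agreement together with each engine's scores instead of discarding it.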