Department of Public Health Sciences, University of California, Davis, Davis, California 95616, United States.
Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland.
J Proteome Res. 2019 Sep 6;18(9):3353-3359. doi: 10.1021/acs.jproteome.9b00288. Epub 2019 Aug 23.
The processing of peptide tandem mass spectrometry data involves matching observed spectra against a sequence database. The ranking and calibration of these peptide-spectrum matches can be improved substantially using a machine learning postprocessor. Here, we describe our efforts to speed up one widely used postprocessor, Percolator. The improved software is dramatically faster than the previous version of Percolator, even when using relatively few processors. We tested the new version of Percolator on a data set containing over 215 million spectra and found that the overall running time fell to 23% of that of the unoptimized code (roughly a 4.3-fold speedup). We also show that the memory footprint required by these speedups is modest relative to that of the original version of Percolator.