Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States.
J Proteome Res. 2011 Sep 2;10(9):3871-9. doi: 10.1021/pr101196n. Epub 2011 Jul 29.
Computational analysis of mass spectra remains the bottleneck in many proteomics experiments. SEQUEST was one of the earliest software packages to identify peptides from mass spectra by searching a database of known peptides. Though still popular, SEQUEST performs slowly. Crux and TurboSEQUEST have successfully sped up SEQUEST by adding a precomputed index to the search, but the demand for ever-faster peptide identification software continues to grow. Tide, introduced here, is a software program that implements the SEQUEST algorithm for peptide identification and that achieves a dramatic speedup over Crux and SEQUEST. The optimization strategies detailed here employ a combination of algorithmic and software engineering techniques to achieve speeds up to 170 times faster than a recent version of SEQUEST that uses indexing. For example, on a single Xeon CPU, Tide searches 10,000 spectra against a tryptic database of 27,499 Caenorhabditis elegans proteins at a rate of 1550 spectra per second, which compares favorably with a rate of 8.8 spectra per second for a recent version of SEQUEST with index running on the same hardware.