X!!Tandem, an improved method for running X!tandem in parallel on collections of commodity computers.

Authors

Bjornson Robert D, Carriero Nicholas J, Colangelo Christopher, Shifman Mark, Cheung Kei-Hoi, Miller Perry L, Williams Kenneth

Affiliation

Yale University, Department of Computer Science, P.O. Box 208285, New Haven, Connecticut 06520-8285, USA.

Publication

J Proteome Res. 2008 Jan;7(1):293-9. doi: 10.1021/pr0701198. Epub 2007 Sep 29.

Abstract

The widespread use of mass spectrometry for protein identification has created a demand for computationally efficient methods of matching mass spectrometry data to protein databases. A search using X!Tandem, a popular and representative program, can require hours or days to complete, particularly when missed cleavages and post-translational modifications are considered. Existing techniques for accelerating X!Tandem by employing parallelism are unsatisfactory for a variety of reasons. The paper describes a parallelization of X!Tandem, called X!!Tandem, that shows excellent speedups on commodity hardware and produces the same results as the original program. Furthermore, the parallelization technique used is unusual and potentially useful for parallelizing other complex programs.

Similar articles

2
Protein Identification from Tandem Mass Spectra by Database Searching.
Methods Mol Biol. 2017;1558:357-380. doi: 10.1007/978-1-4939-6783-4_17.
3
PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications.
J Proteome Res. 2011 Jul 1;10(7):2930-6. doi: 10.1021/pr200153k. Epub 2011 May 24.
6
Lookup peaks: a hybrid of de novo sequencing and database search for protein identification by tandem mass spectrometry.
Anal Chem. 2007 Feb 15;79(4):1393-400. doi: 10.1021/ac0617013. Epub 2007 Jan 23.
7
Software eyes for protein post-translational modifications.
Mass Spectrom Rev. 2015 Mar-Apr;34(2):133-47. doi: 10.1002/mas.21425. Epub 2014 Jun 2.
8
SearchGUI: An open-source graphical user interface for simultaneous OMSSA and X!Tandem searches.
Proteomics. 2011 Mar;11(5):996-9. doi: 10.1002/pmic.201000595. Epub 2011 Jan 31.
9
Cloud parallel processing of tandem mass spectrometry based proteomics data.
J Proteome Res. 2012 Oct 5;11(10):5101-8. doi: 10.1021/pr300561q. Epub 2012 Sep 5.

Cited by

1
Differences in Uniquely Identified Peptides Between ddaPASEF and diaPASEF.
Cells. 2024 Nov 7;13(22):1848. doi: 10.3390/cells13221848.
2
The hGID E3 ubiquitin ligase complex targets ARHGAP11A to regulate cell migration.
Life Sci Alliance. 2024 Oct 10;7(12). doi: 10.26508/lsa.202403046. Print 2024 Dec.
3
The SysteMHC Atlas v2.0, an updated resource for mass spectrometry-based immunopeptidomics.
Nucleic Acids Res. 2024 Jan 5;52(D1):D1062-D1071. doi: 10.1093/nar/gkad1068.
5
Seeding the aggregation of TDP-43 requires post-fibrillization proteolytic cleavage.
Nat Neurosci. 2023 Jun;26(6):983-996. doi: 10.1038/s41593-023-01341-4. Epub 2023 May 29.
8
Considerations for constructing a protein sequence database for metaproteomics.
Comput Struct Biotechnol J. 2022 Jan 21;20:937-952. doi: 10.1016/j.csbj.2022.01.018. eCollection 2022.
9
Communication Lower-Bounds for Distributed-Memory Computations for Mass Spectrometry based Omics Data.
J Parallel Distrib Comput. 2022 Mar;161:37-47. doi: 10.1016/j.jpdc.2021.11.001. Epub 2021 Nov 17.
10
High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry Data.
Nat Comput Sci. 2021 Aug;1(8):550-561. doi: 10.1038/s43588-021-00113-z. Epub 2021 Aug 20.

References

1
An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.
J Am Soc Mass Spectrom. 1994 Nov;5(11):976-89. doi: 10.1016/1044-0305(94)80016-2.
2
General framework for developing and evaluating database scoring algorithms using the TANDEM search engine.
Bioinformatics. 2006 Nov 15;22(22):2830-2. doi: 10.1093/bioinformatics/btl379. Epub 2006 Jul 28.
4
A high productivity/low maintenance approach to high-performance computation for biomedicine: four case studies.
J Am Med Inform Assoc. 2005 Jan-Feb;12(1):90-8. doi: 10.1197/jamia.M1571. Epub 2004 Oct 18.
5
The International Protein Index: an integrated database for proteomics experiments.
Proteomics. 2004 Jul;4(7):1985-8. doi: 10.1002/pmic.200300721.
6
TANDEM: matching proteins with tandem mass spectra.
Bioinformatics. 2004 Jun 12;20(9):1466-7. doi: 10.1093/bioinformatics/bth092. Epub 2004 Feb 19.
7
A method for reducing the time required to match protein sequences with tandem mass spectra.
Rapid Commun Mass Spectrom. 2003;17(20):2310-6. doi: 10.1002/rcm.1198.
8
Probability-based protein identification by searching sequence databases using mass spectrometry data.
Electrophoresis. 1999 Dec;20(18):3551-67. doi: 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2.