Colander: a probability-based support vector machine algorithm for automatic screening for CID spectra of phosphopeptides prior to database search.

Authors

Lu Bingwen, Ruse Cristian I, Yates John R

Affiliation

Department of Chemical Physiology, SR-11, The Scripps Research Institute, La Jolla, CA 92037, USA.

Publication

J Proteome Res. 2008 Aug;7(8):3628-34. doi: 10.1021/pr8001194. Epub 2008 Jun 19.

Abstract

We developed a probability-based machine-learning program, Colander, to identify tandem mass spectra that are highly likely to represent phosphopeptides prior to database search. We identified statistically significant diagnostic features of phosphopeptide tandem mass spectra based on ion trap CID MS/MS experiments. Statistics for the features were calculated from 376 validated phosphopeptide spectra and 376 nonphosphopeptide spectra. A probability-based support vector machine (SVM) program, Colander, was then trained on five selected features. Data sets were assembled from LC/LC-MS/MS analyses of large-scale phosphopeptide enrichments from proteolyzed cells and tissues and from synthetic phosphopeptides. These data sets were used to evaluate the capability of Colander to select pS/pT-containing phosphopeptide tandem mass spectra. When applied to unknown tandem mass spectra, Colander can routinely remove 80% of tandem mass spectra while retaining 95% of phosphopeptide tandem mass spectra. The program reduced the computational time spent on database searching by 60-90%. Furthermore, prefiltering tandem mass spectra representing phosphopeptides can increase the number of phosphopeptide identifications under a predefined false positive rate.
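The trade-off stated in the abstract (discarding most spectra while retaining 95% of phosphopeptide spectra) can be illustrated with a minimal sketch of the cutoff-selection step that follows a probabilistic classifier. This is not Colander's implementation: the function names `pick_threshold` and `filter_stats`, the 95% default, and the synthetic scores below are assumptions made purely for illustration, and the SVM itself is not modeled.

```python
import math

def pick_threshold(scores, labels, min_recall=0.95):
    """Return the highest score cutoff that still keeps at least
    `min_recall` of the positive (phosphopeptide) spectra."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive example")
    # Number of positives that must score at or above the cutoff
    # (the small tolerance guards against floating-point round-up).
    need = math.ceil(min_recall * len(positives) - 1e-9)
    return positives[need - 1]

def filter_stats(scores, labels, threshold):
    """Return (fraction of positives retained, fraction of all spectra removed)."""
    kept = [y for s, y in zip(scores, labels) if s >= threshold]
    recall = sum(kept) / sum(labels)
    removed = 1 - len(kept) / len(scores)
    return recall, removed

# Synthetic, hypothetical classifier probabilities (not data from the paper):
# 20 phosphopeptide spectra scoring high, 80 other spectra scoring lower.
pos_scores = [(99 - 2 * i) / 100 for i in range(20)]
neg_scores = [i / 100 for i in range(80)]
scores = pos_scores + neg_scores
labels = [1] * len(pos_scores) + [0] * len(neg_scores)

threshold = pick_threshold(scores, labels)
recall, removed = filter_stats(scores, labels, threshold)
print(f"cutoff={threshold:.2f} retained={recall:.0%} removed={removed:.0%}")
```

Choosing the cutoff from the positives alone fixes the retention rate first; the fraction of spectra removed then depends entirely on how well the classifier separates the two classes.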
