PhoStar: Identifying Tandem Mass Spectra of Phosphorylated Peptides before Database Search.

Affiliations

University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria.

Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria.

Publication information

J Proteome Res. 2018 Jan 5;17(1):290-295. doi: 10.1021/acs.jproteome.7b00563. Epub 2017 Nov 2.

DOI:10.1021/acs.jproteome.7b00563

PMID:29057658

Abstract

Standard proteomics workflows use tandem mass spectrometry followed by sequence database search to analyze complex biological samples. Proteins carrying post-translational modifications such as phosphorylation are typically identified by allowing variable modifications in the searched sequences. Accounting for these variations exponentially increases the combinatorial space in the database, which leads to longer processing times and more false-positive identifications. PhoStar, the tool presented here, uses a supervised machine learning approach to identify spectra that originate from phosphorylated peptides before database search. The model for predicting phosphorylation was trained and validated, with an accuracy of 97.6%, on a large set of high-confidence spectra collected from publicly available experimental data. Its power was further validated by predicting phosphorylation in the complete NIST human and mouse high collision-dissociation spectral libraries, achieving accuracies of 98.2% and 97.9%, respectively. We demonstrate the application of PhoStar by using it to filter spectra before database search. In database searches of HeLa samples, the peptide search space was reduced by 27-66% while still finding at least 97% of the total peptide identifications (at 1% FDR) obtained with a standard workflow.
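
To make the combinatorial growth mentioned in the abstract concrete, the following minimal sketch counts how many phospho-forms a search engine would have to consider for a single peptide when phosphorylation is allowed as a variable modification. It is an illustration only and not part of PhoStar: the function name `count_phospho_forms`, the restriction to S/T/Y residues, the cap `max_mods`, and the example peptide sequences are all assumptions made here.

```python
from math import comb

# Assumption: only serine, threonine and tyrosine are treated as
# phosphorylatable, as is common in variable-modification searches.
PHOSPHO_RESIDUES = frozenset("STY")


def count_phospho_forms(peptide: str, max_mods: int) -> int:
    """Count the distinct forms of `peptide` carrying 0..max_mods phosphorylations.

    Each candidate site is either modified or not, so with n sites and no cap
    there are 2**n forms; a cap of max_mods limits the sum to the first
    max_mods + 1 binomial coefficients.
    """
    n_sites = sum(residue in PHOSPHO_RESIDUES for residue in peptide)
    return sum(comb(n_sites, k) for k in range(min(max_mods, n_sites) + 1))


if __name__ == "__main__":
    # Hypothetical peptides, chosen only to show the effect of site count.
    for sequence in ("GLAVPR", "SLSTYSPR"):
        print(sequence, count_phospho_forms(sequence, max_mods=3))
```

Under these assumptions a peptide without S/T/Y residues contributes a single candidate, while one with five candidate sites and up to three modifications contributes 26. This is why restricting the variable-modification search to spectra predicted to be phosphorylated can shrink the search space.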

Similar articles

Automatic validation of phosphopeptide identifications from tandem mass spectra.

Anal Chem. 2007 Feb 15;79(4):1301-10. doi: 10.1021/ac061334v.

Effective Leveraging of Targeted Search Spaces for Improving Peptide Identification in Tandem Mass Spectrometry Based Proteomics.

J Proteome Res. 2015 Dec 4;14(12):5169-78. doi: 10.1021/acs.jproteome.5b00504. Epub 2015 Nov 24.

Automatic validation of phosphopeptide identifications by the MS2/MS3 target-decoy search strategy.

J Proteome Res. 2008 Apr;7(4):1640-9. doi: 10.1021/pr700675j. Epub 2008 Mar 4.

Prophossi: automating expert validation of phosphopeptide-spectrum matches from tandem mass spectrometry.

Bioinformatics. 2010 Sep 1;26(17):2153-9. doi: 10.1093/bioinformatics/btq341. Epub 2010 Jul 22.

Large-Scale Reanalysis of Publicly Available HeLa Cell Proteomics Data in the Context of the Human Proteome Project.

J Proteome Res. 2018 Dec 7;17(12):4160-4170. doi: 10.1021/acs.jproteome.8b00392. Epub 2018 Sep 17.

Doubling down on phosphorylation as a variable peptide modification.

Proteomics. 2016 Sep;16(18):2444-7. doi: 10.1002/pmic.201500440. Epub 2016 Jun 23.

Protein Identification from Tandem Mass Spectra by Database Searching.

Methods Mol Biol. 2017;1558:357-380. doi: 10.1007/978-1-4939-6783-4_17.

Expanding tandem mass spectral libraries of phosphorylated peptides: advances and applications.

J Proteome Res. 2013 Dec 6;12(12):5971-7. doi: 10.1021/pr4007443. Epub 2013 Oct 29.

Tandem Mass Spectrum Identification via Cascaded Search.

J Proteome Res. 2015 Aug 7;14(8):3027-38. doi: 10.1021/pr501173s. Epub 2015 Jun 30.

Cited by

Detecting diagnostic features in MS/MS spectra of post-translationally modified peptides.

Nat Commun. 2023 Jul 12;14(1):4132. doi: 10.1038/s41467-023-39828-0.

Advances, obstacles, and opportunities for machine learning in proteomics.

Cell Rep Phys Sci. 2022 Oct 19;3(10). doi: 10.1016/j.xcrp.2022.101069. Epub 2022 Sep 22.

Selenium-Encoded Isotopic Signature Targeted Profiling.

ACS Cent Sci. 2018 Aug 22;4(8):960-970. doi: 10.1021/acscentsci.8b00112. Epub 2018 Jul 16.
