PhoStar: Identifying Tandem Mass Spectra of Phosphorylated Peptides before Database Search.

Author information

University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria.

Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria.

Publication information

J Proteome Res. 2018 Jan 5;17(1):290-295. doi: 10.1021/acs.jproteome.7b00563. Epub 2017 Nov 2.

Abstract

Standard proteomics workflows use tandem mass spectrometry followed by sequence database search to analyze complex biological samples. Proteins carrying post-translational modifications such as phosphorylation are typically identified by allowing variable modifications in the searched sequences. Accounting for these variations exponentially increases the combinatorial search space, which leads to longer processing times and more false-positive identifications. PhoStar, the tool presented here, uses a supervised machine learning approach to identify spectra that originate from phosphorylated peptides before database search. The phosphorylation prediction model was trained and validated on a large set of high-confidence spectra collected from publicly available experimental data, reaching an accuracy of 97.6%. Its performance was further validated by predicting phosphorylation in the complete NIST human and mouse higher-energy collisional dissociation (HCD) spectral libraries, achieving accuracies of 98.2% and 97.9%, respectively. We demonstrate the application of PhoStar by using it to filter spectra before database search. In database searches of HeLa samples, the peptide search space was reduced by 27-66% while retaining at least 97% of the total peptide identifications (at 1% FDR) obtained with a standard workflow.
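The pre-search filtering idea described in the abstract can be illustrated with a minimal sketch. It is not PhoStar's model: the abstract does not describe the features or the classifier, so the scoring function below is a crude stand-in that checks for a precursor peak shifted by the neutral loss of phosphoric acid. The names `Spectrum`, `neutral_loss_score`, `split_for_search`, the tolerance, and the threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

H3PO4_MASS = 97.9769  # monoisotopic mass of phosphoric acid, in Da


@dataclass
class Spectrum:
    precursor_mz: float
    charge: int
    peaks: List[Tuple[float, float]]  # (m/z, intensity) pairs


def neutral_loss_score(spec: Spectrum, tol: float = 0.02) -> float:
    """Stand-in score (assumption, not PhoStar's classifier): intensity of
    the precursor-minus-H3PO4 peak relative to the base peak, in [0, 1]."""
    if not spec.peaks:
        return 0.0
    base = max(intensity for _, intensity in spec.peaks)
    if base <= 0:
        return 0.0
    # The neutral loss shifts the precursor m/z by mass / charge.
    target = spec.precursor_mz - H3PO4_MASS / spec.charge
    hits = [i for mz, i in spec.peaks if abs(mz - target) <= tol]
    return max(hits) / base if hits else 0.0


def split_for_search(
    spectra: List[Spectrum],
    score: Callable[[Spectrum], float] = neutral_loss_score,
    threshold: float = 0.5,  # hypothetical cutoff
) -> Tuple[List[Spectrum], List[Spectrum]]:
    """Route spectra into (predicted phospho, predicted non-phospho)."""
    phospho: List[Spectrum] = []
    other: List[Spectrum] = []
    for spec in spectra:
        (phospho if score(spec) >= threshold else other).append(spec)
    return phospho, other
```

In such a workflow, only the first group would be searched with phosphorylation enabled as a variable modification, while the second group would be searched without it, which is where the reduction in search space would come from. Any scoring function with the same signature, such as a trained classifier, could be passed in place of the stand-in.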
