Automatic validation of phosphopeptide identifications from tandem mass spectra.

Authors

Lu Bingwen, Ruse Cristian, Xu Tao, Park Sung Kyu, Yates John

Affiliation

Department of Cell Biology, The Scripps Research Institute, La Jolla, California 92037, USA.

Publication

Anal Chem. 2007 Feb 15;79(4):1301-10. doi: 10.1021/ac061334v.

PMID:17297928

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2527591/

Abstract

We developed and compared two approaches for automated validation of phosphopeptide tandem mass spectra identified using database searching algorithms. Phosphopeptide identifications were obtained through SEQUEST searches of a protein database appended with its decoy (reversed sequences). Statistical evaluation and iterative searches were employed to create a high-quality data set of phosphopeptides. Automation of postsearch validation was approached by two different strategies. By using statistical multiple testing, we calculate a p value for each tentative peptide phosphorylation. In a second method, we use a support vector machine (SVM; a machine learning algorithm) binary classifier to predict whether a tentative peptide phosphorylation is true. We show good agreement (85%) between postsearch validation of phosphopeptide/spectrum matches by multiple testing and that from support vector machines. Automatic methods conform very well with manual expert validation in a blinded test. Additionally, the algorithms were tested on the identification of synthetic phosphopeptides. We show that phosphate neutral losses in tandem mass spectra can be used to assess the correctness of phosphopeptide/spectrum matches. An SVM classifier with a radial basis function provided classification accuracy from 95.7% to 96.8% of the positive data set, depending on search algorithm used. Establishing the efficacy of an identification is a necessary step for further postsearch interrogation of the spectra for complete localization of phosphorylation sites. Our current implementation performs validation of phosphoserine/phosphothreonine-containing peptides having one or two phosphorylation sites from data gathered on an ion trap mass spectrometer. The SVM-based algorithm has been implemented in the software package DeBunker. We illustrate the application of the SVM-based software DeBunker on a large phosphorylation data set.
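
The abstract notes that phosphate neutral losses in tandem mass spectra can help assess whether a phosphopeptide/spectrum match is correct. As a rough illustration only, the sketch below computes one such feature: the relative intensity of a peak at the precursor m/z minus the mass of H3PO4 (97.9769 Da monoisotopic) divided by the charge. The function names, the peak-list representation, and the 0.5 m/z tolerance (a guess suited to low-resolution ion trap data) are assumptions made for this example; this is not the DeBunker implementation and does not reproduce the feature set used in the paper.

```python
# Illustrative sketch only; names and tolerance are assumptions, not DeBunker code.

H3PO4_LOSS = 97.9769  # monoisotopic mass of phosphoric acid (H3PO4), in Da


def neutral_loss_mz(precursor_mz, charge, n_losses=1):
    """Expected m/z of the precursor ion after losing n_losses x H3PO4."""
    return precursor_mz - n_losses * H3PO4_LOSS / charge


def neutral_loss_intensity(peaks, precursor_mz, charge, tol=0.5, n_losses=1):
    """Relative intensity (0..1) of the neutral-loss peak.

    peaks is a list of (mz, intensity) tuples. The strongest peak within
    tol of the expected neutral-loss m/z is normalized to the base peak.
    Returns 0.0 when no peak falls inside the tolerance window.
    """
    if not peaks:
        return 0.0
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return 0.0
    target = neutral_loss_mz(precursor_mz, charge, n_losses)
    hits = [intensity for mz, intensity in peaks if abs(mz - target) <= tol]
    return max(hits) / base if hits else 0.0


if __name__ == "__main__":
    # Made-up spectrum for a doubly charged precursor at m/z 700.0.
    spectrum = [(400.2, 120.0), (651.1, 900.0), (820.4, 300.0)]
    print(neutral_loss_mz(700.0, 2))
    print(neutral_loss_intensity(spectrum, 700.0, 2))
```

A feature like this could serve as one input among several to a binary classifier such as the SVM described above; on its own it is only a heuristic, since the abstract restricts the validated cases to phosphoserine/phosphothreonine-containing peptides with one or two sites measured on an ion trap.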

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e333/2527591/777028de8610/nihms61838f1.jpg

Similar articles

Automatic validation of phosphopeptide identifications by the MS2/MS3 target-decoy search strategy.

J Proteome Res. 2008 Apr;7(4):1640-9. doi: 10.1021/pr700675j. Epub 2008 Mar 4.

Colander: a probability-based support vector machine algorithm for automatic screening for CID spectra of phosphopeptides prior to database search.

J Proteome Res. 2008 Aug;7(8):3628-34. doi: 10.1021/pr8001194. Epub 2008 Jun 19.

Prophossi: automating expert validation of phosphopeptide-spectrum matches from tandem mass spectrometry.

Bioinformatics. 2010 Sep 1;26(17):2153-9. doi: 10.1093/bioinformatics/btq341. Epub 2010 Jul 22.

Correction of errors in tandem mass spectrum extraction enhances phosphopeptide identification.

J Proteome Res. 2013 Dec 6;12(12):5548-57. doi: 10.1021/pr4004486. Epub 2013 Nov 4.

Reference-facilitated phosphoproteomics: fast and reliable phosphopeptide validation by microLC-ESI-Q-TOF MS/MS.

Mol Cell Proteomics. 2007 Aug;6(8):1380-91. doi: 10.1074/mcp.M600480-MCP200. Epub 2007 May 17.

A hierarchical MS2/MS3 database search algorithm for automated analysis of phosphopeptide tandem mass spectra.

Proteomics. 2009 Apr;9(7):1763-70. doi: 10.1002/pmic.200800282.

Importance of manual validation for the identification of phosphopeptides using a linear ion trap mass spectrometer.

J Biomol Tech. 2011 Apr;22(1):10-20.

Confident site localization using a simulated phosphopeptide spectral library.

J Proteome Res. 2015 May 1;14(5):2348-59. doi: 10.1021/acs.jproteome.5b00050. Epub 2015 Mar 27.

PhoStar: Identifying Tandem Mass Spectra of Phosphorylated Peptides before Database Search.

J Proteome Res. 2018 Jan 5;17(1):290-295. doi: 10.1021/acs.jproteome.7b00563. Epub 2017 Nov 2.

Cited by

Application of Proteomics Technologies in Oil Palm Research.

Protein J. 2018 Dec;37(6):473-499. doi: 10.1007/s10930-018-9802-x.

Phosphorylation and Proteasome Recognition of the mRNA-Binding Protein Cth2 Facilitates Yeast Adaptation to Iron Deficiency.

mBio. 2018 Sep 18;9(5):e01694-18. doi: 10.1128/mBio.01694-18.

From raw data to biological discoveries: a computational analysis pipeline for mass spectrometry-based proteomics.

J Am Soc Mass Spectrom. 2015 Nov;26(11):1820-6. doi: 10.1007/s13361-015-1161-7. Epub 2015 May 22.

PhosphoHunter: An Efficient Software Tool for Phosphopeptide Identification.

Adv Bioinformatics. 2015;2015:382869. doi: 10.1155/2015/382869. Epub 2015 Jan 12.

In-line separation by capillary electrophoresis prior to analysis by top-down mass spectrometry enables sensitive characterization of protein complexes.

J Proteome Res. 2014 Dec 5;13(12):6078-86. doi: 10.1021/pr500971h. Epub 2014 Nov 21.

Protein kinase C-η controls CTLA-4-mediated regulatory T cell function.

Nat Immunol. 2014 May;15(5):465-72. doi: 10.1038/ni.2866. Epub 2014 Apr 6.

Proteomics in the characterization of adipose dysfunction in obesity.

Adipocyte. 2012 Jan 1;1(1):25-37. doi: 10.4161/adip.19129.

Protein analysis by shotgun/bottom-up proteomics.

Chem Rev. 2013 Apr 10;113(4):2343-94. doi: 10.1021/cr3003533. Epub 2013 Feb 26.

Unambiguous phosphosite localization using electron-transfer/higher-energy collision dissociation (EThcD).

J Proteome Res. 2013 Mar 1;12(3):1520-5. doi: 10.1021/pr301130k. Epub 2013 Feb 7.

MUMAL: multivariate analysis in shotgun proteomics using machine learning techniques.

BMC Genomics. 2012;13 Suppl 5(Suppl 5):S4. doi: 10.1186/1471-2164-13-S5-S4. Epub 2012 Oct 19.
