Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, USA.
Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
Mol Cell Proteomics. 2021;20:100171. doi: 10.1016/j.mcpro.2021.100171. Epub 2021 Nov 1.
Tandem mass spectrometry (MS/MS)-based phosphoproteomics is a powerful technology for global phosphorylation analysis. However, when we applied four computational pipelines to a typical mass spectrometry (MS)-based phosphoproteomic dataset from a human cancer study, we observed a large discrepancy among the reported phosphopeptide identification and phosphosite localization results, underscoring a critical need for benchmarking. While efforts have been made to compare the performance of computational pipelines using data from synthetic phosphopeptides, evaluations involving real application data have been largely limited to comparing the numbers of phosphopeptide identifications because appropriate evaluation metrics are lacking. We investigated three deep-learning-derived features as potential evaluation metrics: phosphosite probability, Delta RT, and spectral similarity. Predicted phosphosite probability is computed by MusiteDeep, which provides high accuracy as previously reported; Delta RT is defined as the absolute difference between the observed retention time (RT) and the RT predicted by AutoRT; and spectral similarity is defined as the Pearson correlation coefficient between the observed spectrum and the spectrum predicted by pDeep2. Using a synthetic peptide dataset, we found that both Delta RT and spectral similarity provided excellent discrimination between correct and incorrect peptide-spectrum matches (PSMs), not only when incorrect PSMs involved wrong peptide sequences but also when they were caused solely by incorrect phosphosite localization. Based on these results, we used all three deep-learning-derived features as evaluation metrics to compare different computational pipelines on a diverse set of phosphoproteomic datasets and showed their utility in benchmarking the performance of the pipelines.
The benchmark metrics demonstrated in this study will enable users to select computational pipelines and parameters for routine analysis of phosphoproteomics data and will offer guidance for developers to improve computational methods.
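As an illustration of the two PSM-level metrics defined above, the following minimal sketch computes Delta RT as an absolute difference and spectral similarity as a Pearson correlation. It assumes the observed and predicted fragment intensities have already been aligned into equal-length lists, one value per matched fragment ion; how the study matched peaks is not described in the abstract. The function names `delta_rt` and `spectral_similarity` are hypothetical, and the predicted values would come from tools such as AutoRT and pDeep2, which are not called here.

```python
import math
from typing import Sequence


def delta_rt(observed_rt: float, predicted_rt: float) -> float:
    """Absolute difference between observed and predicted retention time."""
    return abs(observed_rt - predicted_rt)


def spectral_similarity(observed: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Pearson correlation between aligned observed and predicted intensities.

    Assumption: both sequences list intensities for the same fragment ions
    in the same order.
    """
    if len(observed) != len(predicted):
        raise ValueError("intensity vectors must have the same length")
    n = len(observed)
    if n < 2:
        raise ValueError("at least two fragment ions are required")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Sums of centered products and squares; the 1/n factors cancel.
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)

    # Assumption: a constant intensity vector carries no shape information,
    # so the similarity is reported as 0.0 rather than left undefined.
    if var_obs == 0 or var_pred == 0:
        return 0.0
    return cov / math.sqrt(var_obs * var_pred)


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    print(delta_rt(35.2, 33.7))
    print(spectral_similarity([0.1, 0.8, 0.3, 1.0], [0.2, 0.7, 0.4, 0.9]))
```

Under this sketch, a correct PSM would be expected to show a small Delta RT and a spectral similarity close to 1, while a wrong sequence or a mislocalized phosphosite would tend to shift one or both values.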