National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China.
Mol Cell Proteomics. 2019 Feb;18(2):391-405. doi: 10.1074/mcp.RA118.000812. Epub 2018 Nov 12.
The open (mass-tolerant) search of tandem mass spectra of peptides shows great potential for the comprehensive detection of post-translational modifications (PTMs) in shotgun proteomics. However, this search strategy has not been widely adopted by the community, and one bottleneck is the lack of appropriate algorithms for automated and reliable post-processing of the coarse and error-prone search results. Here we present PTMiner, a software tool for confident filtering and localization of modifications (mass shifts) detected in an open search. After mass-shift-grouped false discovery rate (FDR) control of peptide-spectrum matches (PSMs), PTMiner uses an empirical Bayesian method to localize modifications through iterative learning of the prior probabilities of each type of modification occurring on different amino acids. The performance of PTMiner was evaluated on three data sets: simulated data, chemically synthesized peptide library data, and modified-peptide spiked-in proteome data. The results showed that PTMiner can effectively control the PSM FDR and accurately localize the modification sites. At 1% real false localization rate (FLR), PTMiner localized 93%, 84% and 83% of the modification sites in the three data sets, respectively, far more than the two open search engines we used and an extended version of the Ascore localization algorithm. We then used PTMiner to analyze a draft map of the human proteome containing 25 million spectra from 30 tissues, and confidently identified over 1.7 million modified PSMs at 1% FDR and 1% FLR, which provided a system-wide view of both known and unknown PTMs in the human proteome.
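To make the idea of "iterative learning of the prior probabilities" concrete, the following is a minimal, illustrative sketch of an empirical Bayes loop for one mass shift. It is not PTMiner's actual model, which the abstract does not specify. The function name `learn_priors_and_localize`, the input layout, and the per-site likelihood values are all assumptions made for illustration: each PSM is given as a peptide string plus one hypothetical spectrum-match likelihood per residue, assumed positive.

```python
from collections import defaultdict

def learn_priors_and_localize(psms, n_iter=20):
    """Illustrative empirical Bayes localization for a single mass shift.

    psms: list of (peptide, likelihoods); likelihoods[i] is an assumed,
    positive score for the mass shift sitting on residue i of the peptide.
    Returns (prior, posteriors): a per-amino-acid prior and, for each PSM,
    the posterior probability of each candidate site.
    """
    residues = {aa for pep, _ in psms for aa in pep}
    # Start from a flat prior over the amino acids seen in the data.
    prior = {aa: 1.0 / len(residues) for aa in residues}
    posteriors = []
    for _ in range(n_iter):
        counts = defaultdict(float)
        posteriors = []
        for pep, lik in psms:
            # Posterior over sites is proportional to prior(residue) * likelihood(site).
            weights = [prior[aa] * l for aa, l in zip(pep, lik)]
            total = sum(weights)
            post = [w / total for w in weights]
            posteriors.append(post)
            # Accumulate soft counts of the modified amino acid.
            for aa, p in zip(pep, post):
                counts[aa] += p
        # Re-estimate the prior from the soft counts.
        norm = sum(counts.values())
        prior = {aa: counts[aa] / norm for aa in residues}
    return prior, posteriors

# Hypothetical example: two spectra clearly place the shift on S,
# a third is ambiguous between G and S on spectral evidence alone.
psms = [
    ("ASK", [0.1, 0.8, 0.1]),
    ("SAK", [0.8, 0.1, 0.1]),
    ("GSA", [0.4, 0.4, 0.2]),
]
prior, posteriors = learn_priors_and_localize(psms)
```

In this toy input the learned prior concentrates on S, so the ambiguous third PSM is resolved in favor of S even though its spectral evidence for G and S is identical. This is the kind of information sharing across PSMs that a learned prior provides.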