Institute of Computing Technology and Key Lab of Intelligent Information Processing, Chinese Academy of Sciences, Beijing 100190, China.
Bioinformatics. 2010 Jun 15;26(12):i399-406. doi: 10.1093/bioinformatics/btq185.
Identification of post-translationally modified proteins has become one of the central issues of current proteomics. Spectral library search is a new and promising computational approach to mass spectrometry-based protein identification. However, its potential in identification of unanticipated post-translational modifications has rarely been explored. The existing spectral library search tools are designed to match the query spectrum to the reference library spectra with the same peptide mass. Thus, spectra of peptides with unanticipated modifications cannot be identified.
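The limitation described above can be illustrated with a minimal sketch of candidate selection by precursor mass. This is not code from pMatch or any existing tool; the `LibraryEntry` type, the `candidates` helper, the tolerance values, and the peptide masses are all hypothetical and chosen only for illustration. The one real-world number is the mass shift of roughly +79.97 Da added by phosphorylation.

```python
from dataclasses import dataclass


@dataclass
class LibraryEntry:
    peptide: str
    precursor_mass: float  # neutral mass in Da (illustrative values, not measured)


def candidates(query_mass, library, tolerance_da):
    """Return library entries whose precursor mass is within tolerance_da of the query."""
    return [e for e in library if abs(e.precursor_mass - query_mass) <= tolerance_da]


# Hypothetical two-entry library.
library = [LibraryEntry("PEPTIDEK", 927.45), LibraryEntry("SAMPLER", 802.40)]

# Query spectrum from the first peptide carrying an unanticipated +79.97 Da modification.
query_mass = 927.45 + 79.97

# Conventional search: narrow window, so the unmodified library entry is never considered.
narrow = candidates(query_mass, library, tolerance_da=3.0)

# Open search: wide window, so the unmodified entry becomes a candidate again.
wide = candidates(query_mass, library, tolerance_da=300.0)
```

Under these assumptions the narrow window returns no candidates, while the wide window returns both entries and leaves it to the scoring step to decide which one, if any, explains the spectrum.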
In this article, we present an open spectral library search tool named pMatch. It extends existing library search algorithms in at least three respects to support the identification of unanticipated modifications. First, the spectra in the library are optimized with the full peptide sequence information to better tolerate the variations in peptide fragmentation patterns caused by modifications. Second, a new scoring system is devised, which uses charge-dependent mass shifts for peak matching and combines a probability-based model with the general spectral dot-product for scoring. Third, a target-decoy strategy is used for false discovery rate control. To demonstrate the effectiveness of pMatch, a library search experiment was conducted on a public dataset of more than 40,000 spectra, in comparison with SpectraST, the most popular library search engine. Additional validations were done on four published datasets comprising more than 150,000 spectra. The results showed that pMatch can effectively identify unanticipated modifications and significantly increase the spectral identification rate.
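The second and third points can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the pMatch scoring system: it covers only the dot-product part (the probability-based model is omitted), it assumes a fragment of charge z carrying the modification shifts by delta_mass / z in m/z, and it estimates the false discovery rate as decoys divided by targets, which is one common convention. The function names `shifted_dot_product` and `fdr_cutoff` and all parameter defaults are hypothetical.

```python
import math


def shifted_dot_product(query, reference, delta_mass, max_charge=2, tol=0.5):
    """Normalized dot product between two peak lists of (m/z, intensity) pairs.

    A reference peak may match a query peak at its own m/z or shifted by
    delta_mass / z for z = 1..max_charge (assumed charge-dependent shift).
    """
    offsets = [0.0] + [delta_mass / z for z in range(1, max_charge + 1)]
    used = set()  # indices of query peaks already matched
    dot = 0.0
    for ref_mz, ref_int in reference:
        best = None
        for i, (q_mz, q_int) in enumerate(query):
            if i in used:
                continue
            if any(abs(q_mz - (ref_mz + off)) <= tol for off in offsets):
                # Keep the most intense unmatched query peak for this reference peak.
                if best is None or q_int > query[best][1]:
                    best = i
        if best is not None:
            used.add(best)
            dot += ref_int * query[best][1]
    norm = math.sqrt(sum(i * i for _, i in reference)) * math.sqrt(
        sum(i * i for _, i in query)
    )
    return dot / norm if norm else 0.0


def fdr_cutoff(matches, max_fdr=0.01):
    """Lowest score cutoff whose estimated FDR (decoys / targets) is <= max_fdr.

    matches is a list of (score, is_decoy) pairs; returns None if no cutoff qualifies.
    """
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in sorted(matches, key=lambda m: m[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy spectra: the query differs from the reference by an 80 Da modification that
# shifts one singly charged fragment by 80 and one doubly charged fragment by 40.
reference = [(100.0, 1.0), (200.0, 1.0), (300.0, 1.0)]
query = [(100.0, 1.0), (280.0, 1.0), (340.0, 1.0)]

closed_score = shifted_dot_product(query, reference, delta_mass=0.0)
open_score = shifted_dot_product(query, reference, delta_mass=80.0)

# Toy scored matches: (score, is_decoy).
matches = [(0.9, False), (0.8, False), (0.7, True), (0.6, False)]
cutoff = fdr_cutoff(matches, max_fdr=0.1)
```

In this toy case only one of three peaks matches without the shift, while all three match once charge-dependent shifts are allowed, which is why the open score is higher. The greedy peak assignment and the handling of tied scores are deliberately simplistic.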
pMatch is available at http://pfind.ict.ac.cn/pmatch/.
Supplementary data are available at Bioinformatics online.