Institute of Computing Technology and Key Lab of Intelligent Information Processing, Chinese Academy of Sciences, Beijing 100190, China.
Mol Cell Proteomics. 2011 May;10(5):M110.000455. doi: 10.1074/mcp.M110.000455. Epub 2011 Feb 14.
Identification of proteins and their modifications via liquid chromatography-tandem mass spectrometry is an important task in proteomics. However, because of the complexity of tandem mass spectra, the majority of spectra cannot be identified. The presence of unanticipated protein modifications is among the major reasons for the low spectral identification rate. The conventional database search approach to protein identification has inherent difficulties in comprehensively detecting protein modifications. In recent years, increasing effort has been devoted to developing unrestrictive approaches to modification identification, but these approaches are often slow. This paper presents a statistical algorithm named DeltAMT (Delta Accurate Mass and Time) for fast detection of abundant protein modifications from tandem mass spectra with high-accuracy precursor masses. The algorithm is based on the fact that the modified and unmodified versions of a peptide are usually present simultaneously in a sample and that their spectra are correlated with each other in precursor mass and retention time. Each pair of spectra is represented as a delta mass and time vector, and bivariate Gaussian mixture models are used to detect modification-related spectral pairs. Unlike previous approaches to unrestrictive modification identification, which rely mainly on fragment information and the mass dimension in liquid chromatography-tandem mass spectrometry, the proposed algorithm makes the most of precursor information. It is therefore highly efficient while remaining accurate and sensitive. On two published data sets, the algorithm effectively detected various modifications and other interesting events, yielding deep insights into the data. Based on these discoveries, the spectral identification rates were significantly increased and many modified peptides were identified.
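The pairing idea described above can be illustrated with a minimal sketch. It is not the published DeltAMT implementation: the function names (`delta_vectors`, `gaussian_pdf`, `fit_two_component_gmm`), the 100 Da pairing window, the two-component model (one tight component for modification-related pairs, one broad component for unrelated pairs), the initialization, and the synthetic data are all assumptions made for illustration. The example mass shift of 15.9949 Da corresponds to oxidation and is used only as a convenient test value.

```python
import numpy as np

def delta_vectors(masses, times, max_delta_mass=100.0):
    """Return (delta mass, delta time) rows for spectrum pairs with 0 < dm <= max_delta_mass.

    The pairing window is a hypothetical parameter, not taken from the paper.
    """
    masses = np.asarray(masses, dtype=float)
    times = np.asarray(times, dtype=float)
    dm = masses[None, :] - masses[:, None]   # dm[i, j] = masses[j] - masses[i]
    dt = times[None, :] - times[:, None]     # dt[i, j] = times[j] - times[i]
    keep = (dm > 0) & (dm <= max_delta_mass)
    return np.column_stack([dm[keep], dt[keep]])

def gaussian_pdf(x, mean, cov):
    """Density of a bivariate Gaussian evaluated at each row of x."""
    diff = x - mean
    inv = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff))

def fit_two_component_gmm(x, n_iter=100):
    """Plain EM for a two-component bivariate Gaussian mixture.

    Component 0 starts tight around the median (candidate modification pairs);
    component 1 starts broad (background pairs). This setup is an assumption.
    """
    x = np.asarray(x, dtype=float)
    total_cov = np.cov(x, rowvar=False)
    means = [np.median(x, axis=0), x.mean(axis=0)]
    covs = [total_cov / 100.0, total_cov.copy()]
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each pair
        dens = np.column_stack(
            [w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs)]
        )
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and covariances
        for k in range(2):
            r = resp[:, k]
            nk = r.sum()
            means[k] = (r[:, None] * x).sum(axis=0) / nk
            diff = x - means[k]
            covs[k] = (r[:, None] * diff).T @ diff / nk + 1e-9 * np.eye(2)
        weights = resp.sum(axis=0) / len(x)
    return weights, means, covs, resp

# Synthetic delta vectors (made-up data): a tight cluster near 15.9949 Da with a
# small retention-time shift, plus a uniform background of unrelated pairs.
rng = np.random.default_rng(0)
signal = np.column_stack([rng.normal(15.9949, 0.002, 300), rng.normal(0.0, 1.5, 300)])
background = np.column_stack([rng.uniform(15.9, 16.1, 300), rng.uniform(-50.0, 50.0, 300)])
pairs = np.vstack([signal, background])

weights, means, covs, resp = fit_two_component_gmm(pairs)
likely_modified = resp[:, 0] > 0.5   # pairs assigned to the tight component
```

Because only precursor masses and retention times enter the computation, no fragment spectra need to be compared in this step, which is the source of the speed advantage the abstract describes.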