Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P. R. China.
Proteomics. 2012 Jan;12(2):226-35. doi: 10.1002/pmic.201100081. Epub 2011 Dec 20.
Determining the monoisotopic peak of a precursor is a basic but non-trivial first step in interpreting mass spectra. The difficulty is that other peaks within the precursor's isolation window can interfere with the determination, leading to an incorrect mass-to-charge ratio or charge state. Here we propose a method, named pParse, that exports the most probable monoisotopic peaks for precursors, including co-eluted precursors. We use the relationship between the position of the highest peak and the mass of the first peak to detect candidate clusters. We then extract three features to rank the candidate clusters: (i) the summed intensity, (ii) the similarity between the experimental and theoretical isotopic distributions, and (iii) the similarity of the elution profiles. At the same precision, the recall of pParse, MaxQuant, and BioWorks was 98-98.8%, 0.5-17%, and 1.8-36.5%, respectively. About 50% of tandem mass spectra are triggered by multiple precursors, which are difficult to identify, so we designed a new scoring function to identify the co-eluted precursors. About 26% of all identified peptides were exclusively from co-eluted peptides. Accurately determining monoisotopic peaks, including those of co-eluted precursors, can therefore greatly increase the peptide identification rate.
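As an illustration of feature (ii), the following minimal Python sketch scores a candidate cluster by comparing its observed intensities with a theoretical isotopic pattern. It is not pParse's actual scoring: the averagine-based carbon count, the carbon-13-only Poisson approximation, the cosine similarity measure, and all function names are assumptions chosen for illustration.

```python
import math

# Assumed constants for this sketch (not taken from the paper).
C13_ABUNDANCE = 0.0107                   # approximate natural abundance of carbon-13
CARBONS_PER_DALTON = 4.9384 / 111.1254   # averagine: carbons per Da of peptide mass
PROTON_MASS = 1.007276                   # Da


def theoretical_pattern(mono_mass, n_peaks=5):
    """Crude isotopic pattern: Poisson model that counts only 13C substitutions."""
    lam = mono_mass * CARBONS_PER_DALTON * C13_ABUNDANCE
    raw = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(n_peaks)]
    total = sum(raw)
    return [p / total for p in raw]      # normalize so the peaks sum to 1


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_candidate(mono_mz, charge, intensities):
    """Similarity between observed intensities and the pattern expected
    if the first listed peak really is the monoisotopic peak."""
    mono_mass = (mono_mz - PROTON_MASS) * charge
    theo = theoretical_pattern(mono_mass, len(intensities))
    return cosine_similarity(intensities, theo)


if __name__ == "__main__":
    # Synthetic cluster: a 2000 Da peptide at charge 2+.
    observed = theoretical_pattern(2000.0, 5)
    spacing = 1.00335 / 2                # approximate isotope spacing in m/z at 2+
    mono_mz = 2000.0 / 2 + PROTON_MASS
    correct = score_candidate(mono_mz, 2, observed)
    # Wrong hypothesis: treat the second peak as the monoisotopic one.
    shifted = score_candidate(mono_mz + spacing, 2, observed[1:])
    print(f"correct hypothesis: {correct:.3f}, shifted hypothesis: {shifted:.3f}")
```

In this synthetic case the correct hypothesis matches the theoretical pattern more closely than the hypothesis shifted by one isotope peak. A real scorer would combine such a similarity with the other two features, summed intensity and elution-profile similarity.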