School of Informatics and Computing, Indiana University, Bloomington, Indiana 47408, USA.
Anal Chem. 2011 Feb 1;83(3):790-6. doi: 10.1021/ac102272r. Epub 2010 Dec 22.
We estimated the reproducibility of tandem mass spectra for the widely used collision-induced dissociation (CID) of peptide ions. Using the Pearson correlation coefficient as a measure of spectral similarity, we found that the within-experiment reproducibility of fragment ion intensities is very high (about 0.85). However, across different experiments and instrument types or setups, the correlation decreases by more than 15% (to about 0.70). We further investigated the accuracy of current predictors of peptide fragmentation spectra and found that they are more accurate than the ad hoc models generally used by search engines (e.g., SEQUEST) and, surprisingly, approach the empirical upper limit set by the average across-experiment spectral reproducibility (especially for charge +1 and charge +2 precursor ions). These results provide evidence that, in terms of modeling accuracy, predicted peptide fragmentation spectra are a viable alternative to spectral libraries for peptide identification, with higher peptide coverage and lower storage requirements. Furthermore, using five data sets of proteome digests produced by two different proteases, we find that PeptideART (a data-driven machine learning approach) is generally more accurate than MassAnalyzer (an approach based on a kinetic model of peptide fragmentation) in predicting fragmentation spectra, but that both models are significantly more accurate than the ad hoc models.
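The similarity measure named in the abstract, the Pearson correlation coefficient between fragment ion intensities, can be illustrated with a minimal sketch. The abstract does not say how peaks from two spectra were paired, so the pairing here is an assumption: each spectrum is a dictionary from a fragment ion label (for example "b2" or "y3") to an intensity, and an ion missing from one spectrum is given intensity 0. The function names `pearson` and `spectral_similarity` are hypothetical and do not come from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spectral_similarity(spec_a, spec_b):
    """Correlate two spectra given as {ion_label: intensity} dictionaries.

    Assumption (not from the paper): ions are matched by label, and an ion
    absent from one spectrum contributes an intensity of 0.0 for it.
    """
    ions = sorted(set(spec_a) | set(spec_b))
    a = [spec_a.get(ion, 0.0) for ion in ions]
    b = [spec_b.get(ion, 0.0) for ion in ions]
    return pearson(a, b)


if __name__ == "__main__":
    # Made-up toy intensities, for illustration only.
    observed = {"b2": 120.0, "y1": 40.0, "y2": 300.0, "y3": 80.0}
    predicted = {"b2": 100.0, "y2": 280.0, "y3": 95.0}
    print(round(spectral_similarity(observed, predicted), 3))
```

Under this reading, the within-experiment figure of about 0.85 would be the average of such a score over pairs of replicate spectra of the same peptide, and a predictor's accuracy would be the same score computed between a predicted and an observed spectrum.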