Wan Yunhu, Yang Austin, Chen Ting
Department of Mathematics, University of Southern California, Los Angeles, California 90089, USA.
Anal Chem. 2006 Jan 15;78(2):432-7. doi: 10.1021/ac051319a.
An accurate scoring function for database search is crucial for peptide identification using tandem mass spectrometry. Although many mathematical models have been proposed to score peptides against tandem mass spectra, our method (called PepHMM, http://msms.cmb.usc.edu) is unique in that it combines information on machine accuracy, mass peak intensity, and correlation among ions into a hidden Markov model (HMM). In addition, we develop a method to calculate the statistical significance of the HMM scores. We implement the method, test it on two sets of experimental data generated by two different types of mass spectrometers, and compare the results with MASCOT and SEQUEST under the same conditions. Results on one data set show that PepHMM has a much higher accuracy (6.5% error rate) than MASCOT (17.4% error rate), and results on the other show that PepHMM identifies 43% and 31% more correct spectra than SEQUEST and MASCOT, respectively.
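The abstract does not spell out the structure of the PepHMM model, so the following is only a generic illustration of how an HMM assigns a likelihood score to an observation sequence, using the standard forward algorithm. The function name `forward_log_likelihood`, the two-state toy model, and the binary "peak absent / peak present" observations are all assumptions made for this sketch; they are not taken from PepHMM.

```python
import math


def forward_log_likelihood(observations, initial, transition, emission):
    """Return log P(observations) under a discrete HMM (forward algorithm).

    initial[s]        : probability of starting in hidden state s
    transition[p][s]  : probability of moving from state p to state s
    emission[s][o]    : probability of emitting symbol o from state s
    """
    n_states = len(initial)
    # alpha[s] is proportional to P(observations so far, current state = s).
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    log_likelihood = 0.0
    for t in range(len(observations)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * transition[p][s] for p in range(n_states))
                * emission[s][observations[t]]
                for s in range(n_states)
            ]
        # Rescale at every step to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_likelihood


if __name__ == "__main__":
    # Hypothetical toy model: 2 hidden states, symbols 0 (no peak) and 1 (peak).
    initial = [0.6, 0.4]
    transition = [[0.7, 0.3], [0.4, 0.6]]
    emission = [[0.9, 0.1], [0.2, 0.8]]
    print(forward_log_likelihood([1, 0, 1, 1], initial, transition, emission))
```

In a database search setting, a score of this kind would be computed for each candidate peptide against the same spectrum and the candidates ranked by it. Assessing the statistical significance of such scores, as the abstract describes, is a separate step that this sketch does not cover.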