Ma Bin, Zhang Kaizhong, Hendrie Christopher, Liang Chengzhi, Li Ming, Doherty-Kirby Amanda, Lajoie Gilles
Department of Computer Science, University of Western Ontario, London, ON N6A 5B7, Canada.
Rapid Commun Mass Spectrom. 2003;17(20):2337-42. doi: 10.1002/rcm.1196.
Several approaches have been described for identifying proteins from tandem mass spectrometry (MS/MS) data. The most common ones match experimental MS/MS data against available sequence databases. These methods have several drawbacks and cannot be used to identify proteins from organisms whose genomes are unknown. In this communication, we describe a new de novo sequencing software package, PEAKS, which extracts amino acid sequence information without the use of databases. PEAKS uses a new model and a new algorithm to efficiently compute the peptide sequences whose fragment ions best explain the peaks in the MS/MS spectrum. The software outputs amino acid sequences with confidence scores for the entire sequences, together with a novel positional scoring scheme that assigns confidence to portions of the sequences. The performance of PEAKS is compared with that of Lutefisk, a well-known de novo sequencing program, using quadrupole time-of-flight (Q-TOF) data obtained for several tryptic peptides from standard proteins.
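To make concrete the idea of a peptide whose fragment ions explain the peaks of a spectrum, here is a minimal Python sketch. It is not the PEAKS model or algorithm, which the abstract does not detail. It assumes singly charged b and y ions only, standard monoisotopic residue masses, and a plain shared-peak count as the score. The names `fragment_mz`, `shared_peak_score`, and `rank_candidates`, and the 0.02 Da tolerance, are illustrative assumptions.

```python
# Illustrative sketch only: a shared-peak count, NOT the PEAKS scoring model.
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_mz(peptide):
    """Return m/z values of the singly charged b and y ions of a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)            # b_i: N-terminal prefix
        ions.append(sum(masses[-i:]) + WATER + PROTON)   # y_i: C-terminal suffix
    return ions


def shared_peak_score(peptide, peaks, tol=0.02):
    """Count theoretical ions that lie within tol (Da) of an observed peak."""
    return sum(
        1 for mz in fragment_mz(peptide)
        if any(abs(mz - p) <= tol for p in peaks)
    )


def rank_candidates(candidates, peaks, tol=0.02):
    """Sort candidate sequences from best to worst shared-peak score."""
    return sorted(
        candidates,
        key=lambda pep: shared_peak_score(pep, peaks, tol),
        reverse=True,
    )


if __name__ == "__main__":
    # Synthetic, noise-free "spectrum" built from a made-up peptide.
    peaks = fragment_mz("SAMPLER")
    for pep in rank_candidates(["ASMPLER", "SAMPLER"], peaks):
        print(pep, shared_peak_score(pep, peaks))
```

A real de novo sequencer cannot enumerate candidates this way, because the number of possible sequences grows exponentially with peptide length. That is why an efficient algorithm for finding the best-scoring sequences matters. Note also that leucine and isoleucine have identical masses, so a mass-based score like this one cannot tell them apart.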