Ning Kang, Leong Hon Wai
Department of Computer Science, National University of Singapore, Block S15, 3 Science Drive 2, Singapore 117543, Singapore.
Comput Syst Bioinformatics Conf. 2007;6:19-30.
Peptide sequencing by tandem mass spectrometry is an important and challenging problem in proteomics. It has been investigated extensively in recent years, and peptide sequencing results are becoming increasingly accurate. However, many of the existing algorithms use computational models that rest on unverified assumptions. We believe that investigating the validity of these assumptions and related problems will lead to improvements in current algorithms. In this paper, we first investigate peptide sequencing without preprocessing the spectrum, and we show that introducing preprocessing of the spectrum makes peptide sequencing faster, easier, and more accurate. We then investigate one very important issue, the anti-symmetric problem in peptide sequencing, and we show experimentally that models that simply ignore the anti-symmetric problem, or that remove all anti-symmetric instances, are too simple for the peptide sequencing problem. We propose a new, more realistic model for the anti-symmetric problem. We also propose a novel algorithm that incorporates preprocessing and the new anti-symmetric model, and experiments show that this algorithm performs better on the datasets examined.
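For readers unfamiliar with the anti-symmetric issue, the following is a minimal sketch of how it is usually described in the de novo sequencing literature, not the model proposed in this paper. A singly charged fragment peak may be read either as a b-ion (prefix) or as a y-ion (suffix), so each peak implies two candidate prefix masses, and a b-ion and its complementary y-ion imply the same cleavage site. The function names, the tolerance, and the assumption of singly charged monoisotopic peaks are all illustrative choices made here.

```python
# Illustrative sketch only: assumes singly charged, monoisotopic fragment peaks.
PROTON = 1.00728   # approximate mass of a proton (Da)
WATER = 18.01056   # approximate mass of H2O (Da)


def prefix_mass_candidates(peak_mz, parent_mass):
    """Return the two prefix residue masses that one peak could imply.

    parent_mass is the neutral peptide mass (sum of residues plus water).
    The first value reads the peak as a b-ion, the second as a y-ion.
    """
    as_b = peak_mz - PROTON
    as_y = parent_mass - peak_mz + PROTON
    return as_b, as_y


def complementary_pairs(peaks, parent_mass, tol=0.5):
    """Find index pairs (i, j), i < j, that look like a b/y pair.

    For singly charged ions, a b-ion and its complementary y-ion
    sum to parent_mass + 2 * PROTON.
    """
    target = parent_mass + 2 * PROTON
    pairs = []
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            if abs(peaks[i] + peaks[j] - target) <= tol:
                pairs.append((i, j))
    return pairs


# Hypothetical example: dipeptide G-A (residue masses 57.02146 and 71.03711).
parent = 57.02146 + 71.03711 + WATER
peaks = [57.02146 + PROTON, 71.03711 + WATER + PROTON, 120.0]
pairs = complementary_pairs(peaks, parent)
```

In this toy spectrum the first two peaks form a complementary pair: reading the second peak as a y-ion yields the same prefix mass as reading the first as a b-ion. A sequencing model has to decide how to treat such pairs, which is the kind of modeling choice the abstract refers to.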