Ryu Soyoung, Goodlett David R, Noble William S, Minin Vladimir N
Department of Statistics, University of Washington, Seattle, WA, USA
Proceedings (IEEE Int Conf Bioinformatics Biomed). 2012 Oct 4:648-653. doi: 10.1109/BIBMW.2012.6470214.
Tandem mass spectrometry experiments generate from thousands to millions of spectra. These spectra can be used to identify the presence of proteins in biological samples. In this work, we propose a new method to identify peptides, substrings of proteins, based on clustered tandem mass spectrometry data. In contrast to previously proposed approaches, which identify one representative spectrum for each cluster using traditional database searching algorithms, our method uses all available information to score all the spectra in a cluster against candidate peptides using Bayesian model selection. We illustrate the performance of our method by applying it to seven-standard-protein mixture data.
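The idea of scoring every spectrum in a cluster against each candidate peptide can be illustrated with a minimal sketch. The abstract does not specify the likelihood model, so everything below is an assumption made for illustration: spectra are reduced to sets of occupied m/z bins, each bin is an independent Bernoulli variable with hypothetical parameters `p_signal` and `p_noise`, spectra in a cluster are treated as conditionally independent given the peptide, and the peptide names and bin values are made up.

```python
import math


def log_likelihood(observed_bins, predicted_bins, n_bins,
                   p_signal=0.8, p_noise=0.05):
    """Toy log-likelihood of one binned spectrum given one candidate peptide.

    Assumption (not from the paper): a bin holds a peak with probability
    p_signal if the candidate predicts a fragment there, else p_noise.
    """
    ll = 0.0
    for b in range(n_bins):
        p = p_signal if b in predicted_bins else p_noise
        ll += math.log(p) if b in observed_bins else math.log(1.0 - p)
    return ll


def cluster_posterior(cluster, candidates, n_bins):
    """Posterior probability of each candidate given ALL spectra in a cluster.

    Uses a uniform prior over candidates and sums per-spectrum
    log-likelihoods (conditional independence is assumed).
    """
    log_prior = -math.log(len(candidates))
    log_post = {
        name: log_prior + sum(log_likelihood(s, bins, n_bins) for s in cluster)
        for name, bins in candidates.items()
    }
    # Normalize with log-sum-exp for numerical stability.
    m = max(log_post.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in log_post.values()))
    return {name: math.exp(v - log_z) for name, v in log_post.items()}


# Hypothetical candidates (predicted fragment bins) and a three-spectrum cluster.
candidates = {"PEPTIDEA": {2, 5, 9, 14}, "PEPTIDEB": {3, 7, 11, 16}}
cluster = [{2, 5, 9, 14}, {2, 5, 14, 17}, {5, 9, 14}]
posterior = cluster_posterior(cluster, candidates, n_bins=20)
best = max(posterior, key=posterior.get)
print(best, posterior[best])
```

Because the evidence from all spectra is pooled, a fragment missing from one noisy spectrum can be compensated by the others, which is the contrast with scoring a single representative spectrum per cluster.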