Dynamic Bayesian Network for Accurate Detection of Peptides from Tandem Mass Spectra.

Author information

Halloran John T, Bilmes Jeff A, Noble William S

Affiliations

Department of Electrical Engineering, University of Washington, Seattle 98195, Washington, United States.

Department of Genome Sciences, University of Washington, Seattle 98195, Washington, United States.

Publication information

J Proteome Res. 2016 Aug 5;15(8):2749-59. doi: 10.1021/acs.jproteome.6b00290. Epub 2016 Jul 22.

Abstract

A central problem in mass spectrometry analysis is identifying, for each observed tandem mass spectrum, the peptide that generated it. We present a dynamic Bayesian network (DBN) toolkit that addresses this problem with a machine learning approach. At the heart of this toolkit is a DBN for Rapid Identification (DRIP), which can be trained from collections of high-confidence peptide-spectrum matches (PSMs). DRIP's score function models fragment-ion matches with Gaussians rather than fixed fragment-ion tolerances, and it finds the optimal alignment between the theoretical and observed spectra by considering all possible alignments, up to a threshold controlled by a beam-pruning algorithm. This function not only yields state-of-the-art database search accuracy but can also be used to generate features that significantly boost the performance of the Percolator postprocessor. The DRIP software is built upon a general-purpose DBN toolkit (GMTK), which allows a wide variety of options for user-specific inference tasks and makes it easy to modify the DRIP model in future work. DRIP is implemented in Python and C++ and is available under an Apache license at http://melodi-lab.github.io/dripToolkit.
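To illustrate the contrast the abstract draws between Gaussian fragment-ion matching and fixed tolerances, here is a minimal Python sketch. It is not DRIP's implementation and does not use the DRIP or GMTK APIs. The function names, the tolerance of 0.5, the standard deviation of 0.2, and the intensity-weighted sum are all assumptions chosen for illustration, and the alignment and beam-pruning steps are left out.

```python
import math


def fixed_tolerance_match(theoretical_mz, observed_mz, tol=0.5):
    """Binary match: 1.0 if the observed peak lies within +/- tol, else 0.0.

    The tolerance value is a hypothetical default, not taken from the paper.
    """
    return 1.0 if abs(observed_mz - theoretical_mz) <= tol else 0.0


def gaussian_match(theoretical_mz, observed_mz, sigma=0.2):
    """Unnormalized Gaussian score: 1.0 at an exact match, decaying smoothly.

    sigma is a hypothetical value; in a trained model it would be learned.
    """
    delta = observed_mz - theoretical_mz
    return math.exp(-0.5 * (delta / sigma) ** 2)


def score_spectrum(theoretical, observed, match_fn):
    """Sum, over theoretical fragment m/z values, the best intensity-weighted
    match among the observed (m/z, intensity) peaks.

    This simple sum is an illustrative stand-in for a real score function.
    """
    total = 0.0
    for t_mz in theoretical:
        total += max(match_fn(t_mz, mz) * inten for mz, inten in observed)
    return total


if __name__ == "__main__":
    theoretical = [100.0, 200.0]
    observed = [(100.0, 1.0), (200.6, 2.0)]  # second peak is 0.6 m/z off
    print("fixed tolerance:", score_spectrum(theoretical, observed, fixed_tolerance_match))
    print("gaussian:", score_spectrum(theoretical, observed, gaussian_match))
```

With a fixed tolerance, a peak just outside the window contributes nothing. The Gaussian score instead gives it a small, nonzero contribution that shrinks as the mass error grows.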
