A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.

Affiliations

Department of Public Health Sciences, University of California, Davis, Davis, California 95616, United States.

Division of Biostatistics, University of California, Davis, Davis, California 95616, United States.

Publication information

J Proteome Res. 2018 May 4;17(5):1978-1982. doi: 10.1021/acs.jproteome.7b00767. Epub 2018 Apr 6.

Abstract

Percolator is an important tool for improving the results of a database search and of subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To reduce analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than through heuristic approaches, which would require careful study of their impact on the learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l-SVM-MFN, large-scale SVM learning requires only about a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l-SVM-MFN, large-scale Percolator SVM learning is reduced to about a fifth of the original runtime. Importantly, these speedups affect only the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under the Apache license at bitbucket.org/jthalloran/percolator_upgrade.
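The recalibration idea in the abstract can be illustrated with a minimal sketch. It is not Percolator's code and implements neither l-SVM-MFN nor TRON. It assumes an L2-regularized linear SVM with a squared-hinge loss, minimizes that objective with plain gradient descent, and re-scores each peptide-spectrum match (PSM) by its signed distance to the learned boundary. The two features, the synthetic data, and all function names are hypothetical. Real target sets also contain incorrect matches, which this toy data ignores.

```python
import random


def make_psms(n_per_class, seed=0):
    """Synthetic PSM feature vectors: label +1 for targets, -1 for decoys."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((1, 1.5), (-1, -1.5)):
        for _ in range(n_per_class):
            # Two hypothetical features, e.g. a search score and a mass-error term.
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


def score(w, b, x):
    """Recalibrated score: signed decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_l2_svm(X, y, C=1.0, lr=0.05, epochs=300):
    """Minimize 0.5*||w||^2 + (C/n) * sum(max(0, 1 - y_i*score_i)^2).

    Plain gradient descent stands in for the Newton-type solvers
    (an assumption made to keep the sketch short).
    """
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regularizer 0.5*||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            if margin < 1.0:  # only margin violators contribute to the loss
                coeff = -2.0 * (C / n) * yi * (1.0 - margin)
                for j, xj in enumerate(xi):
                    grad_w[j] += coeff * xj
                grad_b += coeff
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


X, y = make_psms(200)
w, b = train_l2_svm(X, y)
rescored = [score(w, b, x) for x in X]  # higher means more target-like
```

Because the objective is convex, any solver that converges reaches the same minimizer. This is consistent with the abstract's statement that the speedups change runtime but not recalibration performance.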

Similar articles

2
Speeding Up Percolator.
J Proteome Res. 2019 Sep 6;18(9):3353-3359. doi: 10.1021/acs.jproteome.9b00288. Epub 2019 Aug 23.

Cited by

3
Bioinformatics Pipeline for Processing Single-Cell Data.
Methods Mol Biol. 2024;2817:221-239. doi: 10.1007/978-1-0716-3934-4_15.
8
Speeding Up Percolator.
J Proteome Res. 2019 Sep 6;18(9):3353-3359. doi: 10.1021/acs.jproteome.9b00288. Epub 2019 Aug 23.
