
A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.

Affiliations

Department of Public Health Sciences, University of California, Davis, Davis, California 95616, United States.

Division of Biostatistics, University of California, Davis, Davis, California 95616, United States.

Publication information

J Proteome Res. 2018 May 4;17(5):1978-1982. doi: 10.1021/acs.jproteome.7b00767. Epub 2018 Apr 6.

DOI:10.1021/acs.jproteome.7b00767
PMID:29607643
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6420878/
Abstract

Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires nearly only a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to nearly only a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator_upgrade.
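The recalibration idea in the abstract (learn a linear decision boundary between target and decoy peptide-spectrum matches, then use the decision value as the new score) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two toy features, the synthetic data, the function names, and the objective, an L2-regularized squared-hinge-loss linear SVM. The solver is plain gradient descent, not l2-SVM-MFN or TRON, and the sketch omits the rest of Percolator's pipeline.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=300):
    """Gradient descent on 0.5*lam*||w||^2 + mean(max(0, 1 - y*(w.x + b))^2).

    Illustrative solver only; it is neither l2-SVM-MFN nor TRON.
    """
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute to the loss
                coef = -2.0 * yi * (1.0 - margin) / n
                for j, xj in enumerate(xi):
                    grad_w[j] += coef * xj
                grad_b += coef
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rescore(X, w, b):
    """Recalibrated score: signed distance-like value from the boundary."""
    return [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]


def make_toy_psms(n_per_class=200, seed=0):
    """Synthetic PSMs with two hypothetical features; targets are +1, decoys -1."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.gauss(2.0, 1.0), rng.gauss(1.0, 1.0)])
        y.append(1)
        X.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(-1)
    return X, y


X, y = make_toy_psms()
w, b = train_linear_svm(X, y)
scores = rescore(X, w, b)

target_scores = [s for s, yi in zip(scores, y) if yi == 1]
decoy_scores = [s for s, yi in zip(scores, y) if yi == -1]
target_mean = sum(target_scores) / len(target_scores)
decoy_mean = sum(decoy_scores) / len(decoy_scores)
accuracy = sum((s > 0) == (yi == 1) for s, yi in zip(scores, y)) / len(y)
print(f"mean target score {target_mean:.2f}, mean decoy score {decoy_mean:.2f}")
```

A faster solver for an objective of this kind would change how quickly `w` and `b` are reached, not the boundary itself, which is the sense in which the abstract says the speedups do not alter recalibration performance.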


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19e8/6420878/c454c1f831af/nihms-1014248-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19e8/6420878/19cf4e36c35b/nihms-1014248-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19e8/6420878/e8e8e200e810/nihms-1014248-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19e8/6420878/ddf43f25555e/nihms-1014248-f0004.jpg
