Improving X!Tandem on peptide identification from mass spectrometry by self-boosted Percolator.

Affiliation

School of Information Technologies, University of Sydney, NSW 2006, Australia.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2012 Sep-Oct;9(5):1273-80. doi: 10.1109/TCBB.2012.86.

DOI:10.1109/TCBB.2012.86

Abstract

A critical component in mass spectrometry (MS)-based proteomics is an accurate protein identification procedure. Database search algorithms commonly generate a list of peptide-spectrum matches (PSMs). The validity of these PSMs is critical for downstream analysis since proteins that are present in the sample are inferred from those PSMs. A variety of postprocessing algorithms have been proposed to validate and filter PSMs. Among them, the most popular ones include a semi-supervised learning (SSL) approach known as Percolator and an empirical modeling approach known as PeptideProphet. However, they are predominantly designed for commercial database search algorithms, i.e., SEQUEST and MASCOT. Therefore, it is highly desirable to extend and optimize those PSM postprocessing algorithms for open source database search algorithms such as X!Tandem. In this paper, we propose a Self-boosted Percolator for postprocessing X!Tandem search results. We find that the SSL algorithm utilized by Percolator depends heavily on the initial ranking of PSMs. Starting with a poor PSM ranking list may cause Percolator to perform suboptimally. By implementing Percolator in a cascade learning manner, we can progressively improve the performance through multiple boost runs, enabling many more PSM identifications without sacrificing false discovery rate (FDR).
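The cascade idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, a plain logistic regression stands in for the linear SVM that Percolator uses, cross-validation is omitted, and FDR is estimated with a simple decoy/target ratio. Each run picks confident target PSMs from the current ranking as positives, uses decoys as negatives, retrains, and rescores; the self-boosting step feeds the scores from one run in as the initial ranking of the next.

```python
import math

def qvalues(scores, is_decoy):
    """Target-decoy q-values; FDR at a threshold is estimated as decoys/targets."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fdrs, targets, decoys = [], 0, 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    q, running = [0.0] * len(scores), float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running = min(running, fdrs[rank])  # best FDR at this or a looser cut
        q[order[rank]] = running
    return q

def count_accepted(scores, is_decoy, fdr=0.01):
    """Number of target PSMs accepted at the given FDR threshold."""
    q = qvalues(scores, is_decoy)
    return sum(1 for qi, d in zip(q, is_decoy) if not d and qi <= fdr)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear(X, y, epochs=100, lr=0.5):
    """Logistic regression by batch gradient descent (assumed SVM stand-in)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, t in zip(X, y):
            z = max(-30.0, min(30.0, dot(w, x) + b))  # clamp to avoid overflow
            err = 1.0 / (1.0 + math.exp(-z)) - t
            gb += err
            for j, xj in enumerate(x):
                gw[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def percolator_like_run(X, is_decoy, init_scores, fdr=0.01, iters=3):
    """One semi-supervised run that starts from the ranking in init_scores."""
    scores = list(init_scores)
    for _ in range(iters):
        q = qvalues(scores, is_decoy)
        pos = [x for x, qi, d in zip(X, q, is_decoy) if not d and qi <= fdr]
        neg = [x for x, d in zip(X, is_decoy) if d]
        if not pos or not neg:
            break  # nothing to learn from; keep the current ranking
        w, b = train_linear(pos + neg, [1] * len(pos) + [0] * len(neg))
        scores = [dot(w, x) + b for x in X]
    return scores

def self_boosted(X, is_decoy, init_scores, runs=3, fdr=0.01):
    """Cascade: the output scores of each run seed the next run's ranking."""
    scores = list(init_scores)
    history = [count_accepted(scores, is_decoy, fdr)]
    for _ in range(runs):
        scores = percolator_like_run(X, is_decoy, scores, fdr)
        history.append(count_accepted(scores, is_decoy, fdr))
    return scores, history
```

Here `X` is a list of per-PSM feature vectors, `is_decoy` flags decoy matches, and `init_scores` would be a search-engine score such as the X!Tandem score. The returned `history` lists the number of accepted target PSMs before boosting and after each run, which makes it easy to check whether further runs still help on a given data set.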

Similar articles

Optimization of Search Engines and Postprocessing Approaches to Maximize Peptide and Protein Identification for High-Resolution Mass Data.

J Proteome Res. 2015 Nov 6;14(11):4662-73. doi: 10.1021/acs.jproteome.5b00536. Epub 2015 Sep 30.

Enhanced peptide identification by electron transfer dissociation using an improved Mascot Percolator.

Mol Cell Proteomics. 2012 Aug;11(8):478-91. doi: 10.1074/mcp.O111.014522. Epub 2012 Apr 6.

Combining percolator with X!Tandem for accurate and sensitive peptide identification.

J Proteome Res. 2013 Jun 7;12(6):3026-33. doi: 10.1021/pr4001256. Epub 2013 May 1.

l2 Multiple Kernel Fuzzy SVM-Based Data Fusion for Improving Peptide Identification.

IEEE/ACM Trans Comput Biol Bioinform. 2016 Jul-Aug;13(4):804-9. doi: 10.1109/TCBB.2015.2480084. Epub 2015 Sep 18.

Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics.

J Proteome Res. 2014 Apr 4;13(4):1911-20. doi: 10.1021/pr401026y. Epub 2014 Mar 13.

AttnPep: A Self-Attention-Based Deep Learning Method for Peptide Identification in Shotgun Proteomics.

J Proteome Res. 2024 Feb 2;23(2):834-843. doi: 10.1021/acs.jproteome.3c00729. Epub 2024 Jan 22.

Improvements to the percolator algorithm for Peptide identification from shotgun proteomics data sets.

J Proteome Res. 2009 Jul;8(7):3737-45. doi: 10.1021/pr801109k.

RT-PSM, a real-time program for peptide-spectrum matching with statistical significance.

Rapid Commun Mass Spectrom. 2006;20(8):1199-208. doi: 10.1002/rcm.2435.

Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0.

J Am Soc Mass Spectrom. 2016 Nov;27(11):1719-1727. doi: 10.1007/s13361-016-1460-7. Epub 2016 Aug 29.

Cited by

TIDD: tool-independent and data-dependent machine learning for peptide identification.

BMC Bioinformatics. 2022 Mar 30;23(1):109. doi: 10.1186/s12859-022-04640-y.

A Systematic Evaluation of Semispecific Peptide Search Parameter Enables Identification of Previously Undescribed N-Terminal Peptides and Conserved Proteolytic Processing in Cancer Cell Lines.

Proteomes. 2021 May 25;9(2):26. doi: 10.3390/proteomes9020026.

Quality control of imbalanced mass spectra from isotopic labeling experiments.

BMC Bioinformatics. 2019 Nov 6;20(1):549. doi: 10.1186/s12859-019-3170-1.

Speeding Up Percolator.

J Proteome Res. 2019 Sep 6;18(9):3353-3359. doi: 10.1021/acs.jproteome.9b00288. Epub 2019 Aug 23.

Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification.

BMC Genomics. 2016 Dec 22;17(Suppl 13):1031. doi: 10.1186/s12864-016-3327-5.

A Multi-network Approach Identifies Protein-Specific Co-expression in Asymptomatic and Symptomatic Alzheimer's Disease.

Cell Syst. 2017 Jan 25;4(1):60-72.e4. doi: 10.1016/j.cels.2016.11.006. Epub 2016 Dec 15.

Confident and sensitive phosphoproteomics using combinations of collision induced dissociation and electron transfer dissociation.

J Proteomics. 2014 May 30;103(100):1-14. doi: 10.1016/j.jprot.2014.03.010. Epub 2014 Mar 21.

Fast and accurate database searches with MS-GF+Percolator.

J Proteome Res. 2014 Feb 7;13(2):890-7. doi: 10.1021/pr400937n. Epub 2013 Dec 23.

Open source libraries and frameworks for mass spectrometry based proteomics: a developer's perspective.

Biochim Biophys Acta. 2014 Jan;1844(1 Pt A):63-76. doi: 10.1016/j.bbapap.2013.02.032. Epub 2013 Mar 1.
