Automatic quality assessment of peptide tandem mass spectra.

Author information

Bern Marshall, Goldberg David, McDonald W Hayes, Yates John R

Affiliation

Palo Alto Research Center, Palo Alto, CA 94304, USA.

Publication information

Bioinformatics. 2004 Aug 4;20 Suppl 1:i49-54. doi: 10.1093/bioinformatics/bth947.

DOI:10.1093/bioinformatics/bth947

PMID:15262780

Abstract

MOTIVATION

A powerful proteomics methodology couples high-performance liquid chromatography (HPLC) with tandem mass spectrometry and database-search software, such as SEQUEST. Such a set-up, however, produces a large number of spectra, many of which are of too poor quality to be useful. Hence a filter that eliminates poor spectra before the database search can significantly improve throughput and robustness. Moreover, spectra judged to be of high quality, but that cannot be identified by database search, are prime candidates for still more computationally intensive methods, such as de novo sequencing or wider database searches including post-translational modifications.

RESULTS

We report on two different approaches to assessing spectral quality prior to identification: binary classification, which predicts whether or not SEQUEST will be able to make an identification, and statistical regression, which predicts a more universal quality metric involving the number of b- and y-ion peaks. The best of our binary classifiers can eliminate over 75% of the unidentifiable spectra while losing only 10% of the identifiable spectra. Statistical regression can pick out spectra of modified peptides that can be identified by a de novo program but not by SEQUEST. In a section of independent interest, we discuss intensity normalization of mass spectra.
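The trade-off quoted for the binary classifier (unidentifiable spectra eliminated versus identifiable spectra lost) can be illustrated with a short Python sketch. It is not the authors' classifier: the function name `filter_tradeoff`, the score scale, the threshold, and the toy data are all assumptions made for illustration. It only shows how the two percentages would be computed once each spectrum has a quality score and a label saying whether the database search identified it.

```python
from typing import Sequence, Tuple


def filter_tradeoff(
    scores: Sequence[float],
    identifiable: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Evaluate a quality filter that discards spectra scoring below `threshold`.

    Returns (fraction of unidentifiable spectra eliminated,
             fraction of identifiable spectra lost).
    All names here are hypothetical; the abstract does not specify them.
    """
    if len(scores) != len(identifiable):
        raise ValueError("scores and identifiable must have the same length")

    n_good = sum(1 for ok in identifiable if ok)
    n_bad = len(identifiable) - n_good
    if n_good == 0 or n_bad == 0:
        raise ValueError("need at least one spectrum of each class")

    # A spectrum is discarded when its score falls below the threshold.
    bad_removed = sum(
        1 for s, ok in zip(scores, identifiable) if not ok and s < threshold
    )
    good_lost = sum(
        1 for s, ok in zip(scores, identifiable) if ok and s < threshold
    )
    return bad_removed / n_bad, good_lost / n_good


# Toy data (invented): four unidentifiable spectra, then four identifiable ones.
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
toy_labels = [False, False, False, False, True, True, True, True]

eliminated, lost = filter_tradeoff(toy_scores, toy_labels, threshold=0.5)
print(f"unidentifiable eliminated: {eliminated:.0%}, identifiable lost: {lost:.0%}")
```

Sweeping `threshold` over the range of scores traces out the full trade-off curve, from which an operating point such as the one reported in the abstract would be chosen.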

Similar articles

Quality classification of tandem mass spectrometry data.

Bioinformatics. 2006 Feb 15;22(4):400-6. doi: 10.1093/bioinformatics/bti829. Epub 2005 Dec 13.

Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations.

Nat Methods. 2005 Sep;2(9):667-75. doi: 10.1038/nmeth785.

De novo peptide sequencing using ion peak intensity and amino acid cleavage intensity ratio.

Bioinformatics. 2007 May 1;23(9):1068-72. doi: 10.1093/bioinformatics/btm062. Epub 2007 Mar 6.

A fast coarse filtering method for peptide identification by mass spectrometry.

Bioinformatics. 2006 Jun 15;22(12):1524-31. doi: 10.1093/bioinformatics/btl118. Epub 2006 Apr 3.

Robust accurate identification of peptides (RAId): deciphering MS2 data using a structured library search with de novo based statistics.

Bioinformatics. 2005 Oct 1;21(19):3726-32. doi: 10.1093/bioinformatics/bti620. Epub 2005 Aug 16.

pNovo: de novo peptide sequencing and identification using HCD spectra.

J Proteome Res. 2010 May 7;9(5):2713-24. doi: 10.1021/pr100182k.

PepSplice: cache-efficient search algorithms for comprehensive identification of tandem mass spectra.

Bioinformatics. 2007 Nov 15;23(22):3016-23. doi: 10.1093/bioinformatics/btm417. Epub 2007 Sep 3.

De novo sequencing methods in proteomics.

Methods Mol Biol. 2010;604:105-21. doi: 10.1007/978-1-60761-444-9_8.

Quality assessment of peptide tandem mass spectra.

BMC Bioinformatics. 2008 May 28;9 Suppl 6(Suppl 6):S13. doi: 10.1186/1471-2105-9-S6-S13.

Cited by

Improving Spectral Similarity and Molecular Network Reliability through Noise Signal Filtering in MS/MS Spectra.

Anal Chem. 2025 Jul 29;97(29):15873-15882. doi: 10.1021/acs.analchem.5c02109. Epub 2025 Jul 17.

A complementary approach for detecting biological signals through a semi-automated feature selection tool.

Front Chem. 2024 Oct 25;12:1477492. doi: 10.3389/fchem.2024.1477492. eCollection 2024.

Predicting the Diagnostic Information of Tandem Mass Spectra of Environmentally Relevant Compounds Using Machine Learning.

Anal Chem. 2023 Oct 24;95(42):15810-15817. doi: 10.1021/acs.analchem.3c03470. Epub 2023 Oct 9.

Deep Learning-based MSMS Spectra Reduction in Support of Running Multiple Protein Search Engines on Cloud.

Proceedings (IEEE Int Conf Bioinformatics Biomed). 2017 Nov;2017:1909-1914. doi: 10.1109/bibm.2017.8217951. Epub 2017 Dec 18.

An Out-of-Core GPU based dimensionality reduction algorithm for Big Mass Spectrometry Data and its application in bottom-up Proteomics.

ACM BCB. 2017 Aug;2017:550-555. doi: 10.1145/3107411.3107466.

Phosphorylation-dependent inhibition of Cdc42 GEF Gef1 by 14-3-3 protein Rad24 spatially regulates Cdc42 GTPase activity and oscillatory dynamics during cell morphogenesis.

Mol Biol Cell. 2015 Oct 1;26(19):3520-34. doi: 10.1091/mbc.E15-02-0095. Epub 2015 Aug 5.

Using collective expert judgements to evaluate quality measures of mass spectrometry images.

Bioinformatics. 2015 Jun 15;31(12):i375-84. doi: 10.1093/bioinformatics/btv266.

Molecular dissection of the interaction between the AMPA receptor and cornichon homolog-3.

J Neurosci. 2014 Sep 3;34(36):12104-20. doi: 10.1523/JNEUROSCI.0595-14.2014.

Proteomic analysis of the Plasmodium male gamete reveals the key role for glycolysis in flagellar motility.

Malar J. 2014 Aug 13;13:315. doi: 10.1186/1475-2875-13-315.

Mechanisms of acute kidney injury induced by experimental Lonomia obliqua envenomation.

Arch Toxicol. 2015 Mar;89(3):459-83. doi: 10.1007/s00204-014-1264-0. Epub 2014 May 6.
