msmsEval: tandem mass spectral quality assignment for high-throughput proteomics.

Authors

Wong Jason W H, Sullivan Matthew J, Cartwright Hugh M, Cagney Gerard

Affiliation

Chemistry Department, Oxford University, Physical and Theoretical Chemistry Laboratory, South Parks Road, Oxford OX1 3QZ, UK.

Publication information

BMC Bioinformatics. 2007 Feb 9;8:51. doi: 10.1186/1471-2105-8-51.

DOI:10.1186/1471-2105-8-51

PMID:17291342

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1803797/

Abstract

BACKGROUND

In proteomics experiments, database-search programs are the method of choice for protein identification from tandem mass spectra. As amino acid sequence databases grow, however, the computing resources required by these programs have become prohibitive, particularly in searches for modified proteins. Recently, several research groups have proposed limiting the number of spectra to be searched on the basis of spectral quality, but rankings of spectral quality have so far relied on arbitrary cut-off values. In this work, we develop a more readily interpretable spectral quality statistic by assigning each spectrum a probability that it will be identifiable.

RESULTS

We describe an application, msmsEval, that builds on previous work by statistically modeling the spectral quality discriminant function using a Gaussian mixture model. This allows a researcher to filter spectra based on the probability that a spectrum will ultimately be identified by database searching. We show that spectra predicted by msmsEval to be of high quality that nevertheless remain unidentified in standard database searches are candidates for more intensive search strategies. Using a well-studied public dataset, we also show that a high proportion (83.9%) of the spectra that msmsEval predicts to be of high quality, but that elude standard search strategies, are in fact interpretable.
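The idea of turning a discriminant score into a probability can be illustrated with a minimal sketch. It assumes each spectrum already has a one-dimensional quality score and fits a two-component Gaussian mixture by expectation-maximisation, using only the Python standard library. The function names, the synthetic scores and the 0.9 threshold are illustrative assumptions, not msmsEval's actual features, parameters or code.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_high(score, weights, means, sds):
    """Posterior probability that a score belongs to the high-quality component."""
    low = weights[0] * normal_pdf(score, means[0], sds[0])
    high = weights[1] * normal_pdf(score, means[1], sds[1])
    total = low + high
    return high / total if total > 0 else 0.5


def fit_two_component_gmm(scores, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Index 0 is the low-score component, index 1 the high-score component.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    ordered = sorted(scores)
    half = n // 2
    # Initialise the means from the lower and upper halves of the sorted scores.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(scores) / n
    grand_sd = math.sqrt(sum((s - grand_mean) ** 2 for s in scores) / n)
    sds = [max(grand_sd, 1e-6)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of the high-score component for each score.
        resp = [posterior_high(s, weights, means, sds) for s in scores]
        n_high = sum(resp)
        n_low = n - n_high
        if n_high < 1e-9 or n_low < 1e-9:
            break  # one component has collapsed; keep the last estimates
        # M-step: responsibility-weighted means, variances and mixing weights.
        means = [
            sum((1 - r) * s for r, s in zip(resp, scores)) / n_low,
            sum(r * s for r, s in zip(resp, scores)) / n_high,
        ]
        var_low = sum((1 - r) * (s - means[0]) ** 2 for r, s in zip(resp, scores)) / n_low
        var_high = sum(r * (s - means[1]) ** 2 for r, s in zip(resp, scores)) / n_high
        sds = [max(math.sqrt(var_low), 1e-6), max(math.sqrt(var_high), 1e-6)]
        weights = [n_low / n, n_high / n]
    return weights, means, sds


# Synthetic discriminant scores (an assumption for illustration only):
# 600 low-quality spectra and 400 high-quality spectra.
random.seed(0)
scores = [random.gauss(-2.0, 1.0) for _ in range(600)]
scores += [random.gauss(3.0, 1.0) for _ in range(400)]

weights, means, sds = fit_two_component_gmm(scores)

# Keep only spectra whose probability of being identifiable is at least 0.9.
kept = [s for s in scores if posterior_high(s, weights, means, sds) >= 0.9]
print(len(kept), "of", len(scores), "spectra pass the 0.9 probability filter")
```

Filtering on a posterior probability rather than on the raw score gives the threshold a direct interpretation: a cut-off of 0.9 keeps spectra that the fitted model considers at least 90% likely to belong to the high-quality population.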

CONCLUSION

msmsEval will be useful for high-throughput proteomics projects and is freely available for download from http://proteomics.ucd.ie/msmseval. It supports Windows, Mac OS X and Linux/Unix operating systems.
