Optimization of the Use of Consensus Methods for the Detection and Putative Identification of Peptides via Mass Spectrometry Using Protein Standard Mixtures.

Author information

Sultana Tamanna, Jordan Rick, Lyons-Weiler James

Affiliations

Bioinformatics Analysis Core, Genomics and Proteomics Core Laboratories and Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA.

Publication information

J Proteomics Bioinform. 2009 Jun 1;2(6):262-273. doi: 10.4172/jpb.1000085.

DOI:10.4172/jpb.1000085

PMID:19779596

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2749508/

Abstract

Correct identification of peptides and proteins in complex biological samples from proteomic mass spectra is a challenging problem in bioinformatics. The sensitivity and specificity of identification algorithms depend on their underlying scoring methods, some being more sensitive and others more specific. For high-throughput, automated peptide identification, control over the algorithms' performance in terms of the trade-off between sensitivity and specificity is desirable. Combinations of algorithms, called 'consensus methods', have been shown to provide more accurate results than individual algorithms. However, due to the proliferation of algorithms and their varied internal settings, a systematic understanding of the relative performance of individual and consensus methods is lacking. We performed an in-depth analysis of various approaches to consensus scoring using known protein mixtures, and evaluated the performance of 2310 settings generated from the consensus of three different search algorithms: Mascot, Sequest, and X!Tandem. Our findings indicate that the union of Mascot, Sequest, and X!Tandem performed well in terms of overall accuracy, and that methods using 80-99.9% protein probability and/or a minimum of 2 peptides and/or 0-50% minimum peptide probability for protein identification performed better, on average, than the other consensus methods tested in terms of overall accuracy. The results also suggest method selection strategies that provide direct control over sensitivity and specificity.
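The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes each search engine's output has already been reduced to a set of protein identifiers, and that the true contents of the standard mixture are known. The function names, the toy identifiers, and the majority-vote rule are illustrative assumptions, not the authors' implementation or data.

```python
def consensus(calls, mode="union"):
    """Combine per-engine sets of identified proteins (hypothetical helper)."""
    sets = list(calls.values())
    if mode == "union":
        return set().union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    if mode == "majority":
        candidates = set().union(*sets)
        # Keep a protein if more than half of the engines report it.
        return {p for p in candidates if 2 * sum(p in s for s in sets) > len(sets)}
    raise ValueError(f"unknown mode: {mode}")


def confusion_metrics(identified, truth, universe):
    """Sensitivity, specificity, and accuracy against a known mixture."""
    tp = len(identified & truth)
    fp = len(identified - truth)
    fn = len(truth - identified)
    tn = len(universe - truth - identified)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / len(universe) if universe else 0.0,
    }


# Toy data (assumed for illustration only): 10 database proteins,
# 4 of which are truly present in the standard mixture.
universe = {f"P{i}" for i in range(1, 11)}
truth = {"P1", "P2", "P3", "P4"}
calls = {
    "Mascot": {"P1", "P2", "P5"},
    "Sequest": {"P2", "P3"},
    "X!Tandem": {"P1", "P3", "P4", "P6"},
}

for mode in ("union", "majority", "intersection"):
    metrics = confusion_metrics(consensus(calls, mode), truth, universe)
    print(mode, {k: round(v, 3) for k, v in metrics.items()})
```

On this toy input the union recovers every true protein at the cost of two false positives, while the intersection reports nothing at all; stricter consensus rules trade sensitivity for specificity, which is the kind of control over method selection the abstract refers to.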

Similar articles

Evaluation of the Consensus of Four Peptide Identification Algorithms for Tandem Mass Spectrometry Based Proteomics.

J Proteomics Bioinform. 2010 Feb 5;3:39-47. doi: 10.4172/jpb.1000119.

Comparison of Mascot and X!Tandem performance for low and high accuracy mass spectrometry and the development of an adjusted Mascot threshold.

Mol Cell Proteomics. 2008 May;7(5):962-70. doi: 10.1074/mcp.M700293-MCP200. Epub 2008 Jan 23.

Binomial probability distribution model-based protein identification algorithm for tandem mass spectrometry utilizing peak intensity information.

J Proteome Res. 2013 Jan 4;12(1):328-35. doi: 10.1021/pr300781t. Epub 2012 Nov 29.

Enhanced peptide quantification using spectral count clustering and cluster abundance.

BMC Bioinformatics. 2011 Oct 28;12:423. doi: 10.1186/1471-2105-12-423.

Accurate and sensitive peptide identification with Mascot Percolator.

J Proteome Res. 2009 Jun;8(6):3176-81. doi: 10.1021/pr800982s.

ProLuCID: An improved SEQUEST-like algorithm with enhanced sensitivity and specificity.

J Proteomics. 2015 Nov 3;129:16-24. doi: 10.1016/j.jprot.2015.07.001. Epub 2015 Jul 11.

Optimization of Search Engines and Postprocessing Approaches to Maximize Peptide and Protein Identification for High-Resolution Mass Data.

J Proteome Res. 2015 Nov 6;14(11):4662-73. doi: 10.1021/acs.jproteome.5b00536. Epub 2015 Sep 30.

The effects of mass accuracy, data acquisition speed, and search algorithm choice on peptide identification rates in phosphoproteomics.

Anal Bioanal Chem. 2007 Nov;389(5):1409-19. doi: 10.1007/s00216-007-1563-x. Epub 2007 Sep 14.

An evaluation, comparison, and accurate benchmarking of several publicly available MS/MS search algorithms: sensitivity and specificity analysis.

Proteomics. 2005 Aug;5(13):3475-90. doi: 10.1002/pmic.200500126.

Cited by

Deep Learning-based MSMS Spectra Reduction in Support of Running Multiple Protein Search Engines on Cloud.

Proceedings (IEEE Int Conf Bioinformatics Biomed). 2017 Nov;2017:1909-1914. doi: 10.1109/bibm.2017.8217951. Epub 2017 Dec 18.

Combining High-Resolution and Exact Calibration To Boost Statistical Power: A Well-Calibrated Score Function for High-Resolution MS2 Data.

J Proteome Res. 2018 Nov 2;17(11):3644-3656. doi: 10.1021/acs.jproteome.8b00206. Epub 2018 Oct 18.

Practical and Efficient Searching in Proteomics: A Cross Engine Comparison.

Webmedcentral. 2013 Oct 1;4(10). doi: 10.9754/journal.wplus.2013.0052.

Combining results of multiple search engines in proteomics.

Mol Cell Proteomics. 2013 Sep;12(9):2383-93. doi: 10.1074/mcp.R113.027797. Epub 2013 May 29.

MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines.

J Proteome Res. 2011 Jul 1;10(7):2949-58. doi: 10.1021/pr2002116. Epub 2011 Apr 29.

Discovery of mouse spleen signaling responses to anthrax using label-free quantitative phosphoproteomics via mass spectrometry.

Mol Cell Proteomics. 2011 Mar;10(3):M110.000927. doi: 10.1074/mcp.M110.000927. Epub 2010 Dec 28.

Coherent pipeline for biomarker discovery using mass spectrometry and bioinformatics.

BMC Bioinformatics. 2010 Aug 26;11:437. doi: 10.1186/1471-2105-11-437.

Evaluation of the Consensus of Four Peptide Identification Algorithms for Tandem Mass Spectrometry Based Proteomics.

J Proteomics Bioinform. 2010 Feb 5;3:39-47. doi: 10.4172/jpb.1000119.
