On using samples of known protein content to assess the statistical calibration of scores assigned to peptide-spectrum matches in shotgun proteomics.

Affiliation

Center for Biomembrane Research, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden.

Publication information

J Proteome Res. 2011 May 6;10(5):2671-8. doi: 10.1021/pr1012619. Epub 2011 Mar 24.

DOI:10.1021/pr1012619

PMID:21391616

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3268674/

Abstract

In shotgun proteomics, the quality of a hypothesized match between an observed spectrum and a peptide sequence is quantified by a score function. Because the score function lies at the heart of any peptide identification pipeline, this function greatly affects the final results of a proteomics assay. Consequently, valid statistical methods for assessing the quality of a given score function are extremely important. Previously, several research groups have used samples of known protein composition to assess the quality of a given score function. We demonstrate that this approach is problematic, because the outcome can depend on factors other than the score function itself. We then propose an alternative use of the same type of data to validate a score function. The central idea of our approach is that database matches that are not explained by any protein in the purified sample comprise a robust representation of incorrect matches. We apply our alternative assessment scheme to several commonly used score functions, and we show that our approach generates a reproducible measure of the calibration of a given peptide identification method. Furthermore, we show how our quality test can be useful in the development of novel score functions.
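The central idea in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes each peptide-spectrum match already carries a p-value, that the set of peptides explained by the purified sample is known, and that a calibrated score gives roughly uniform p-values for incorrect matches. The function names, the peptide labels, and the simulated data are all hypothetical.

```python
import random


def unexplained_p_values(psms, sample_peptides):
    """Return p-values of matches whose peptide is not explained by the sample.

    Assumption for this sketch: such matches are treated as incorrect.
    """
    return [p for peptide, p in psms if peptide not in sample_peptides]


def ks_distance_from_uniform(p_values):
    """Largest gap between the empirical CDF of p_values and the Uniform(0, 1) CDF."""
    n = len(p_values)
    if n == 0:
        raise ValueError("no unexplained matches to assess")
    ordered = sorted(p_values)
    return max(
        max((i + 1) / n - p, p - i / n)
        for i, p in enumerate(ordered)
    )


# Simulated data (hypothetical): 200 matches to sample peptides with small
# p-values, plus 2000 matches to peptides that the sample cannot explain.
rng = random.Random(0)
sample_peptides = {"PEPTIDEK", "SAMPLER"}
explained = [("PEPTIDEK", rng.random() * 1e-3) for _ in range(200)]

# Calibrated score: incorrect matches get uniformly distributed p-values.
calibrated_psms = explained + [(f"ENTRAP{i}", rng.random()) for i in range(2000)]
# Miscalibrated score: incorrect matches get p-values that are too small.
skewed_psms = explained + [(f"ENTRAP{i}", rng.random() ** 2) for i in range(2000)]

d_calibrated = ks_distance_from_uniform(
    unexplained_p_values(calibrated_psms, sample_peptides)
)
d_skewed = ks_distance_from_uniform(
    unexplained_p_values(skewed_psms, sample_peptides)
)
print(f"calibrated score: distance from uniform = {d_calibrated:.3f}")
print(f"skewed score:     distance from uniform = {d_skewed:.3f}")
```

A small distance suggests that the p-values assigned to incorrect matches behave as a calibrated score would require, while a large distance points to a miscalibrated score. Because only the unexplained matches enter the check, the result does not depend on how many correct matches the sample happens to produce.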
