NovoBoard: A Comprehensive Framework for Evaluating the False Discovery Rate and Accuracy of De Novo Peptide Sequencing.

Author information

Bioinformatics Solutions Inc, Waterloo, Ontario, Canada.

Bioinformatics Solutions Inc, Waterloo, Ontario, Canada; David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada.

Publication information

Mol Cell Proteomics. 2024 Nov;23(11):100849. doi: 10.1016/j.mcpro.2024.100849. Epub 2024 Sep 24.

Abstract

De novo peptide sequencing is one of the most fundamental research areas in mass spectrometry-based proteomics. Many methods have been evaluated using only a couple of simple metrics that do not fully reflect their overall performance. Moreover, there has been no established method to estimate the false discovery rate (FDR) of de novo peptide-spectrum matches. Here we propose NovoBoard, a comprehensive framework to evaluate the performance of de novo peptide-sequencing methods. The framework consists of diverse benchmark datasets (including tryptic, nontryptic, immunopeptidomics, and different species) and a standard set of accuracy metrics to evaluate the fragment ions, amino acids, and peptides of the de novo results. More importantly, a new approach is designed to evaluate de novo peptide-sequencing methods on target-decoy spectra and to estimate and validate their FDRs. Our FDR estimation provides valuable information to assess the reliability of new peptides identified by de novo sequencing tools, especially when no ground-truth information is available to evaluate their accuracy. The FDR estimation can also be used to evaluate the capability of de novo peptide sequencing tools to distinguish between de novo peptide-spectrum matches and random matches. Our results reveal the strengths and weaknesses of different de novo peptide-sequencing methods and how their performance depends on the specific application and the type of data.
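To make the idea of a target-decoy FDR estimate concrete, the following is a minimal Python sketch of the generic decoy-to-target ratio at a score threshold. It is not NovoBoard's procedure, which the abstract does not spell out; the function name `estimate_fdr`, the `(score, is_decoy)` input layout, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def estimate_fdr(matches: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR as (#decoy matches) / (#target matches) at a score cutoff.

    `matches` holds (score, is_decoy) pairs; `is_decoy` is a hypothetical flag
    marking a peptide-spectrum match that came from a decoy spectrum.
    This is the generic target-decoy ratio, assumed here for illustration.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target matches: report 0.0 rather than divide by zero.
        return 0.0
    # Cap at 1.0 because the ratio can exceed 1 when decoys outnumber targets.
    return min(1.0, decoys / targets)


# Made-up scores, purely for demonstration.
example_matches = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
fdr_at_06 = estimate_fdr(example_matches, threshold=0.6)
```

With the made-up scores above, three target matches and one decoy match pass the 0.6 cutoff, so the estimate is one third. Raising the cutoff trades fewer accepted matches for a lower estimated FDR.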

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/67fe/11532909/866bbd4640b4/ga1.jpg