A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics.

Affiliations

Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, 400065 Chongqing, China.

Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States.

Publication information

J Proteome Res. 2022 Jun 3;21(6):1566-1574. doi: 10.1021/acs.jproteome.2c00069. Epub 2022 May 13.

Abstract

Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST (best-identified spectrum) and BIN (spectrum binning) methods were found to be the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTMs) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.
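To make the consensus strategies named in the abstract concrete, the following is a minimal Python sketch of three of them (binning, most similar spectrum, and best-identified spectrum). It is not the authors' implementation, which lives in the GitHub repository above. The function names, the `bin_width` and `min_fraction` parameters, the use of cosine similarity, and the assumption that a higher identification score is better are all illustrative choices made here, not details taken from the study.

```python
from collections import defaultdict
from math import sqrt

# A spectrum is modeled as a list of (m/z, intensity) peaks.
# All names and default values below are illustrative assumptions.

def bin_spectrum(peaks, bin_width=0.02):
    """Sum peak intensities into fixed-width m/z bins: {bin_index: intensity}."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return dict(binned)

def cosine(a, b):
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def bin_consensus(cluster, bin_width=0.02, min_fraction=0.5):
    """Binning-style consensus: average the intensity in each bin over the
    cluster and keep bins seen in at least min_fraction of its spectra."""
    n = len(cluster)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for peaks in cluster:
        for idx, intensity in bin_spectrum(peaks, bin_width).items():
            totals[idx] += intensity
            counts[idx] += 1
    # Report each kept bin at its center m/z.
    return [((idx + 0.5) * bin_width, totals[idx] / n)
            for idx in sorted(totals) if counts[idx] / n >= min_fraction]

def most_similar_index(cluster, bin_width=0.02):
    """Index of the member with the highest mean similarity to the others."""
    binned = [bin_spectrum(peaks, bin_width) for peaks in cluster]

    def mean_similarity(i):
        sims = [cosine(binned[i], binned[j])
                for j in range(len(binned)) if j != i]
        return sum(sims) / len(sims) if sims else 1.0

    return max(range(len(binned)), key=mean_similarity)

def best_identified_index(scores):
    """Index of the member with the best search-engine score
    (assumes higher is better, which depends on the score used)."""
    return max(range(len(scores)), key=scores.__getitem__)

# Toy cluster of three spectra with made-up peaks.
cluster = [
    [(100.0, 10.0), (200.0, 5.0)],
    [(100.0, 12.0), (200.0, 6.0), (300.0, 2.0)],
    [(100.0, 1.0), (300.0, 20.0)],
]
consensus = bin_consensus(cluster, bin_width=1.0, min_fraction=1.0)
representative = most_similar_index(cluster, bin_width=1.0)
best = best_identified_index([0.2, 0.9, 0.5])
```

The binning sketch synthesizes a new spectrum from all cluster members, whereas the other two functions select an existing member as the representative. Spectrum averaging is omitted here because it requires a peak-matching tolerance whose details the abstract does not specify.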
