The generating function of CID, ETD, and CID/ETD pairs of tandem mass spectra: applications to database search.

Affiliation

Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA.

Publication information

Mol Cell Proteomics. 2010 Dec;9(12):2840-52. doi: 10.1074/mcp.M110.003731. Epub 2010 Sep 9.

Abstract

The recent emergence of new mass spectrometry techniques (e.g. electron transfer dissociation, ETD) and the improved availability of additional proteases (e.g. Lys-N) for protein digestion in high-throughput experiments have raised the challenge of designing new algorithms for interpreting the resulting new types of tandem mass (MS/MS) spectra. Traditional MS/MS database search algorithms such as SEQUEST and Mascot were originally designed for collision-induced dissociation (CID) of tryptic peptides, and their CID-specific scoring functions are largely based on expert knowledge about the fragmentation of tryptic peptides rather than on machine learning techniques. As a result, the performance of these algorithms is suboptimal for new mass spectrometry technologies or nontryptic peptides. We recently proposed the generating function approach (MS-GF) for CID spectra of tryptic peptides. In this study, we extend MS-GF to automatically derive scoring parameters from a set of annotated MS/MS spectra of any type (e.g. CID, ETD, etc.), and present a new database search tool, MS-GFDB, based on MS-GF. We show that MS-GFDB outperforms Mascot for ETD spectra or peptides digested with Lys-N. For example, in the case of ETD spectra, the numbers of tryptic and Lys-N peptides identified by MS-GFDB increased by factors of 2.7 and 2.6, respectively, as compared with Mascot. Moreover, even following a decade of Mascot developments for analyzing CID spectra of tryptic peptides, MS-GFDB (which is not particularly tailored for CID spectra or tryptic peptides) resulted in a 28% increase over Mascot in the number of peptide identifications. Finally, we propose a statistical framework for analyzing multiple spectra from the same precursor (e.g. CID/ETD spectral pairs) and assigning p values to peptide-spectrum-spectrum matches.
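The abstract does not spell out how a generating function yields a p value, so the following is only a minimal sketch of the general idea, not the MS-GFDB scoring model. It assumes integer residue masses, a toy three-letter alphabet, uniform residue probabilities, and a hypothetical score that simply counts prefix masses matching a peak. The names `AA_MASSES`, `score_distribution`, and `spectral_probability` are illustrative and do not come from the paper.

```python
from collections import defaultdict

# Nominal integer residue masses for a toy alphabet (assumption: a real tool
# would use all 20 residues and a learned, spectrum-type-specific score).
AA_MASSES = {"G": 57, "A": 71, "S": 87}


def score_distribution(peaks, parent_mass, aa_masses=AA_MASSES):
    """Return {score: probability} over all peptides with the given total mass.

    The toy score of a peptide is the number of its proper prefix masses
    that coincide with a peak. Each residue is drawn with equal probability.
    """
    aa_prob = 1.0 / len(aa_masses)
    peaks = set(peaks)
    # table[m][s] = total probability of residue strings with mass m and score s
    table = [defaultdict(float) for _ in range(parent_mass + 1)]
    table[0][0] = 1.0
    for m in range(1, parent_mass + 1):
        # The full parent mass is not a proper prefix, so it never scores.
        gain = 1 if (m in peaks and m != parent_mass) else 0
        for a in aa_masses.values():
            if a > m:
                continue
            for s, p in table[m - a].items():
                table[m][s + gain] += aa_prob * p
    return dict(table[parent_mass])


def spectral_probability(dist, threshold):
    """Total probability of peptides scoring at least `threshold`."""
    return sum(p for s, p in dist.items() if s >= threshold)
```

With peaks at 57 and 128 and a parent mass of 215, the only peptide in this toy alphabet that matches both peaks is GAS, so `spectral_probability(score_distribution({57, 128}, 215), 2)` equals 1/27. The dynamic program reaches this without enumerating peptides, which is what makes the approach tractable for realistic masses.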

Similar articles

UniNovo: a universal tool for de novo peptide sequencing.
Bioinformatics. 2013 Aug 15;29(16):1953-62. doi: 10.1093/bioinformatics/btt338. Epub 2013 Jun 12.

Cited by

A Review of Protein Inference.
Methods Mol Biol. 2025;2859:53-64. doi: 10.1007/978-1-0716-4152-1_4.
Enhancing Open Modification Searches via a Combined Approach Facilitated by Ursgal.
J Proteome Res. 2021 Apr 2;20(4):1986-1996. doi: 10.1021/acs.jproteome.0c00799. Epub 2021 Jan 29.
