The generating function of CID, ETD, and CID/ETD pairs of tandem mass spectra: applications to database search.

Affiliation

Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA.

Publication information

Mol Cell Proteomics. 2010 Dec;9(12):2840-52. doi: 10.1074/mcp.M110.003731. Epub 2010 Sep 9.

DOI:10.1074/mcp.M110.003731

PMID:20829449

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3101864/

Abstract

The recent emergence of new mass spectrometry techniques (e.g. electron transfer dissociation, ETD) and the improved availability of additional proteases (e.g. Lys-N) for protein digestion in high-throughput experiments have raised the challenge of designing new algorithms for interpreting the resulting new types of tandem mass (MS/MS) spectra. Traditional MS/MS database search algorithms such as SEQUEST and Mascot were originally designed for collision induced dissociation (CID) of tryptic peptides, and their CID-specific scoring functions are largely based on expert knowledge about the fragmentation of tryptic peptides rather than on machine learning techniques. As a result, the performance of these algorithms is suboptimal for new mass spectrometry technologies or nontryptic peptides. We recently proposed the generating function approach (MS-GF) for CID spectra of tryptic peptides. In this study, we extend MS-GF to automatically derive scoring parameters from a set of annotated MS/MS spectra of any type (e.g. CID or ETD), and present a new database search tool, MS-GFDB, based on MS-GF. We show that MS-GFDB outperforms Mascot for ETD spectra or peptides digested with Lys-N. For example, in the case of ETD spectra, the numbers of tryptic and Lys-N peptides identified by MS-GFDB increased by factors of 2.7 and 2.6, respectively, compared with Mascot. Moreover, even after a decade of Mascot development for analyzing CID spectra of tryptic peptides, MS-GFDB (which is not particularly tailored for CID spectra or tryptic peptides) yielded a 28% increase over Mascot in the number of peptide identifications. Finally, we propose a statistical framework for analyzing multiple spectra from the same precursor (e.g. CID/ETD spectral pairs) and assigning p values to peptide-spectrum-spectrum matches.
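The abstract does not spell out how a generating function turns a match score into a p value, so the following is only a minimal sketch of the general idea under loudly stated assumptions: a toy integer-mass residue alphabet, an additive integer score attached to each prefix mass, and residues drawn uniformly at random. The names `RESIDUE_MASSES`, `score_distribution`, and `spectral_probability` are hypothetical and are not taken from MS-GF or MS-GFDB.

```python
from collections import defaultdict

# Hypothetical toy alphabet with integer masses (NOT real amino-acid masses).
RESIDUE_MASSES = {"A": 2, "B": 3, "C": 5}


def score_distribution(node_scores, parent_mass, residue_masses=RESIDUE_MASSES):
    """Return {score: probability} over all peptides of total mass parent_mass.

    node_scores maps a prefix mass to an integer score (missing masses score 0).
    A peptide's score is the sum of the scores of its prefix masses, and its
    probability is (1 / alphabet size) ** length. Both are assumptions of this
    sketch, not the scoring model of the paper.
    """
    p = 1.0 / len(residue_masses)
    # dist[m][t] = total probability of residue strings with mass m and score t
    dist = [defaultdict(float) for _ in range(parent_mass + 1)]
    dist[0][0] = 1.0  # the empty prefix has mass 0 and score 0
    for m in range(1, parent_mass + 1):
        s = node_scores.get(m, 0)
        for r in residue_masses.values():
            if m - r < 0:
                continue
            # Extend every string of mass m - r by one residue of mass r.
            for t, prob in dist[m - r].items():
                dist[m][t + s] += prob * p
    return dict(dist[parent_mass])


def spectral_probability(node_scores, parent_mass, threshold,
                         residue_masses=RESIDUE_MASSES):
    """Total probability of peptides of mass parent_mass scoring >= threshold."""
    dist = score_distribution(node_scores, parent_mass, residue_masses)
    return sum(prob for t, prob in dist.items() if t >= threshold)


if __name__ == "__main__":
    toy_scores = {2: 1, 5: 3, 7: -1}  # hypothetical per-mass scores
    print(spectral_probability(toy_scores, parent_mass=10, threshold=3))
```

The dynamic program visits each (mass, score) pair once, so the full score distribution is obtained without enumerating peptides; the tail sum then plays the role of a p value for a given match score in this toy setting.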
