HMMatch: peptide identification by spectral matching of tandem mass spectra using hidden Markov models.

Authors

Wu Xue, Tseng Chau-Wen, Edwards Nathan

Affiliation

Department of Computer Science, University of Maryland, College Park, MD 20742, USA.

Publication information

J Comput Biol. 2007 Oct;14(8):1025-43. doi: 10.1089/cmb.2007.0071.

DOI:10.1089/cmb.2007.0071

PMID:17985986

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3772688/

Abstract

Peptide identification by tandem mass spectrometry is the dominant proteomics workflow for protein characterization in complex samples. The peptide fragmentation spectra generated by these workflows exhibit characteristic fragmentation patterns that can be used to identify the peptide. In other fields, where the compounds of interest do not have the convenient linear structure of peptides, fragmentation spectra are identified by comparing new spectra with libraries of identified spectra, an approach called spectral matching. In contrast to sequence-based tandem mass spectrometry search engines used for peptides, spectral matching can make use of the intensities of fragment peaks in library spectra to assess the quality of a match. We evaluate a hidden Markov model approach (HMMatch) to spectral matching, in which many examples of a peptide's fragmentation spectrum are summarized in a generative probabilistic model that captures the consensus and variation of each peak's intensity. We demonstrate that HMMatch has good specificity and superior sensitivity, compared to sequence database search engines such as X!Tandem. HMMatch achieves good results from relatively few training spectra, is fast to train, and can evaluate many spectra per second. A statistical significance model permits HMMatch scores to be compared with each other, and with other peptide identification tools, on a unified scale. HMMatch shows a similar degree of concordance with X!Tandem, Mascot, and NIST's MS Search, as they do with each other, suggesting that each tool can assign peptides to spectra that the others miss. Finally, we show that it is possible to extrapolate HMMatch models beyond a single peptide's training spectra to the spectra of related peptides, expanding the application of spectral matching techniques beyond the set of peptides previously observed.
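As a rough illustration of the general idea of spectral matching described in the abstract (scoring a query spectrum against a library spectrum using fragment peak intensities), the sketch below computes a normalized dot product between two binned spectra. This is not the HMMatch model, which the abstract describes as a hidden Markov model trained on many example spectra per peptide. The function names, the `(m/z, intensity)` peak representation, and the 1.0 m/z bin width are all assumptions made for this example.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks: iterable of (mz, intensity) pairs (hypothetical representation).
    Returns a dict mapping bin index -> summed intensity.
    """
    binned = {}
    for mz, intensity in peaks:
        idx = int(mz // bin_width)
        binned[idx] = binned.get(idx, 0.0) + intensity
    return binned

def cosine_similarity(query_peaks, library_peaks, bin_width=1.0):
    """Normalized dot product of two binned spectra.

    Returns a value in [0, 1] for non-negative intensities;
    0.0 if either spectrum has no signal.
    """
    q = bin_spectrum(query_peaks, bin_width)
    lib = bin_spectrum(library_peaks, bin_width)
    dot = sum(v * lib.get(k, 0.0) for k, v in q.items())
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_lib = math.sqrt(sum(v * v for v in lib.values()))
    if norm_q == 0.0 or norm_lib == 0.0:
        return 0.0
    return dot / (norm_q * norm_lib)

# Made-up peaks, for illustration only.
library = [(175.1, 10.0), (300.2, 50.0), (413.3, 25.0)]
query = [(175.1, 20.0), (300.2, 100.0), (413.3, 50.0)]
score = cosine_similarity(query, library)
```

A single library spectrum scored this way has no notion of how much each peak's intensity varies between acquisitions. According to the abstract, that consensus and variation is what HMMatch captures by summarizing many example spectra in one probabilistic model.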

