Prediction of missed cleavage sites in tryptic peptides aids protein identification in proteomics.

Author information

Siepen Jennifer A, Keevil Emma-Jayne, Knight David, Hubbard Simon J

Affiliation

Faculty of Life Sciences, University of Manchester, M13 9PT, UK.

Publication information

J Proteome Res. 2007 Jan;6(1):399-408. doi: 10.1021/pr060507u.

DOI:10.1021/pr060507u

PMID:17203985

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2664920/

Abstract

Protein identification via peptide mass fingerprinting (PMF) remains a key component of high-throughput proteomics experiments in post-genomic science. Candidate protein identifications are made using bioinformatic tools from peptide peak lists obtained via mass spectrometry (MS). These algorithms rely on several search parameters, including the number of potential uncut peptide bonds matching the primary specificity of the hydrolytic enzyme used in the experiment. Typically, up to one of these "missed cleavages" is considered by the bioinformatics search tools, usually after digestion of the in silico proteome by trypsin. Using two distinct, nonredundant datasets of peptides identified via PMF and tandem MS, a simple predictive method based on information theory is presented which is able to identify experimentally defined missed cleavages with up to 90% accuracy from amino acid sequence alone. Using this simple protocol, we are able to "mask" candidate protein databases so that confident missed cleavage sites need not be considered for in silico digestion. We show that this leads to an improvement in database searching, with two different search engines, using the PMF dataset as a test set. In addition, the improved approach is also demonstrated on an independent PMF data set of known proteins that also has corresponding high-quality tandem MS data, validating the protein identifications. This approach has wider applicability for proteomics database searching, and the program for predicting missed cleavages and masking Fasta-formatted protein sequence databases has been made available via http://ispider.smith.man.ac.uk/MissedCleave.
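To make the role of the missed-cleavage parameter and of masking concrete, here is a minimal Python sketch of in silico tryptic digestion. It is an illustration under stated assumptions, not the authors' MissedCleave program: the function names `cleavage_sites` and `digest` are hypothetical, trypsin is assumed to cut C-terminal to every K or R with no further exceptions, and the `masked` set stands in for positions that some predictor (not implemented here) has confidently flagged as missed cleavages.

```python
def cleavage_sites(sequence, masked=frozenset()):
    """Return 0-based indices of K/R residues after which trypsin is assumed to cut.

    Assumption: cleavage occurs C-terminal to every K or R. Indices listed in
    `masked` are treated as uncuttable (predicted missed cleavages). The last
    residue is skipped because cutting after it would yield an empty peptide.
    """
    return [i for i, aa in enumerate(sequence[:-1])
            if aa in "KR" and i not in masked]


def digest(sequence, max_missed=1, masked=frozenset()):
    """Enumerate peptides allowing up to `max_missed` uncut sites per peptide."""
    if not sequence:
        return []
    sites = cleavage_sites(sequence, masked)
    # Fragment boundaries: start of the sequence, after each site, and the end.
    bounds = [0] + [i + 1 for i in sites] + [len(sequence)]
    n_fragments = len(bounds) - 1
    peptides = []
    for start in range(n_fragments):
        # Joining m + 1 consecutive fragments leaves m sites uncut.
        for m in range(max_missed + 1):
            if start + m < n_fragments:
                peptides.append(sequence[bounds[start]:bounds[start + m + 1]])
    return peptides


if __name__ == "__main__":
    seq = "AAKGGRCC"  # toy sequence, not taken from the paper
    print(digest(seq, max_missed=0))
    print(digest(seq, max_missed=1))
    # Masking the K at index 2 removes that site from the digest entirely.
    print(digest(seq, max_missed=0, masked={2}))
```

In this sketch, masking a site means the search never has to enumerate the cut at that position, so the candidate peptide list can shrink while peptides spanning the predicted missed cleavage are still generated.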
