Improving sensitivity in shotgun proteomics using a peptide-centric database with reduced complexity: protease cleavage and SCX elution rules from data mining of MS/MS spectra.

Authors

Yen Chia-Yu, Russell Steve, Mendoza Alex M, Meyer-Arendt Karen, Sun Shaojun, Cios Krzysztof J, Ahn Natalie G, Resing Katheryn A

Affiliation

Department of Computer Science and Engineering, University of Colorado at Denver and Health Sciences Center, Denver, CO 80217-3364, USA.

Publication information

Anal Chem. 2006 Feb 15;78(4):1071-84. doi: 10.1021/ac051127f.

DOI:10.1021/ac051127f

PMID:16478097

Abstract

Correct identification of a peptide sequence from MS/MS data is still a challenging research problem, particularly in proteomic analyses of higher eukaryotes where protein databases are large. The scoring methods of search programs often generate cases where incorrect peptide sequences score higher than correct peptide sequences (referred to as distraction). Because smaller databases yield less distraction and better discrimination between correct and incorrect assignments, we developed a method for editing a peptide-centric database (PC-DB) to remove unlikely sequences and strategies for enabling search programs to utilize this peptide database. Rules for unlikely missed cleavage and nontryptic proteolysis products were identified by data mining 11 849 high-confidence peptide assignments. We also evaluated ion exchange chromatographic behavior as an editing criterion to generate subset databases. When used to search a well-annotated test data set of MS/MS spectra, we found no loss of critical information using PC-DBs, validating the methods for generating and searching against the databases. On the other hand, improved confidence in peptide assignments was achieved for tryptic peptides, measured by changes in DeltaCN and RSP. Decreased distraction was also achieved, consistent with the 3-9-fold decrease in database size. Data mining identified a major class of common nonspecific proteolytic products corresponding to leucine aminopeptidase (LAP) cleavages. Large improvements in identifying LAP products were achieved using the PC-DB approach when compared with conventional searches against protein databases. These results demonstrate that peptide properties can be used to reduce database size, yielding improved accuracy and information capture due to reduced distraction, but with little loss of information compared to conventional protein database searches.
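The idea of a peptide-centric database can be illustrated with a minimal sketch. It assumes the common trypsin convention (cleave after K or R unless the next residue is P), a fixed limit on missed cleavages, and a length window; these are illustrative defaults, not the data-mined rules from the paper. The function names, the `keep` predicate, and the toy sequences are hypothetical. The `keep` hook stands in for any editing rule that removes unlikely peptides, such as the missed-cleavage or SCX elution criteria the abstract describes.

```python
from collections import defaultdict


def cleavage_fragments(protein):
    """Split a protein at assumed tryptic sites: after K or R, unless followed by P."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(protein, max_missed=1, min_len=6, max_len=40):
    """Return the set of peptides with at most `max_missed` missed cleavages."""
    fragments = cleavage_fragments(protein)
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


def build_peptide_db(proteins, keep=lambda peptide: True, **digest_kwargs):
    """Map each retained peptide to the set of protein IDs that contain it.

    `keep` is a hypothetical editing rule: peptides for which it returns
    False are dropped, which is how the database would be shrunk.
    """
    db = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence, **digest_kwargs):
            if keep(peptide):
                db[peptide].add(protein_id)
    return dict(db)


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    toy = {"P1": "AAAKGGGRPCCCKDDD", "P2": "MMMKGGGRPCCCK"}
    db = build_peptide_db(toy, min_len=4)
    for peptide in sorted(db):
        print(peptide, sorted(db[peptide]))
```

Storing each unique peptide once, with back-references to its parent proteins, is what makes the database peptide-centric: a shared peptide is searched a single time, and any rule passed as `keep` reduces the number of candidate sequences competing for each spectrum.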

Similar articles

Improving sensitivity in shotgun proteomics using a peptide-centric database with reduced complexity: protease cleavage and SCX elution rules from data mining of MS/MS spectra.

Anal Chem. 2006 Feb 15;78(4):1071-84. doi: 10.1021/ac051127f.

Improving peptide identification with single-stage mass spectrum peaks.

Bioinformatics. 2009 Nov 15;25(22):2969-74. doi: 10.1093/bioinformatics/btp501. Epub 2009 Aug 18.

VEMS 3.0: algorithms and computational tools for tandem mass spectrometry based identification of post-translational modifications in proteins.

J Proteome Res. 2005 Nov-Dec;4(6):2338-47. doi: 10.1021/pr050264q.

Improving mass and liquid chromatography based identification of proteins using Bayesian scoring.

J Proteome Res. 2005 Nov-Dec;4(6):2174-84. doi: 10.1021/pr050251c.

Statistical models for protein validation using tandem mass spectral data and protein amino acid sequence databases.

Anal Chem. 2004 Mar 15;76(6):1664-71. doi: 10.1021/ac035112y.

Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focusing of peptides.

J Proteome Res. 2005 Nov-Dec;4(6):2273-82. doi: 10.1021/pr050193v.

Protein identification by tandem mass spectrometry and sequence database searching.

Methods Mol Biol. 2007;367:87-119. doi: 10.1385/1-59745-275-0:87.

Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data.

J Proteome Res. 2007 Jun;6(6):2331-40. doi: 10.1021/pr0700908. Epub 2007 May 9.

Support vector machines for improved peptide identification from tandem mass spectrometry database search.

Methods Mol Biol. 2009;492:453-60. doi: 10.1007/978-1-59745-493-3_28.

Integrated approach for manual evaluation of peptides identified by searching protein sequence databases with tandem mass spectra.

J Proteome Res. 2005 May-Jun;4(3):998-1005. doi: 10.1021/pr049754t.

Cited by

Advances in proteomics: characterization of the innate immune system after birth and during inflammation.

Front Immunol. 2023 Oct 6;14:1254948. doi: 10.3389/fimmu.2023.1254948. eCollection 2023.

Comparative gender peptidomics of venoms: are there differences between them?

J Venom Anim Toxins Incl Trop Dis. 2020 Oct 7;26:e20200055. doi: 10.1590/1678-9199-JVATITD-2020-0055.

Quantitative Measurements of LRRK2 in Human Cerebrospinal Fluid Demonstrates Increased Levels in G2019S Patients.

Front Neurosci. 2020 May 25;14:526. doi: 10.3389/fnins.2020.00526. eCollection 2020.

SpirPep: an in silico digestion-based platform to assist bioactive peptides discovery from a genome-wide database.

BMC Bioinformatics. 2018 Apr 20;19(1):149. doi: 10.1186/s12859-018-2143-0.

The NISTmAb tryptic peptide spectral library for monoclonal antibody characterization.

MAbs. 2018 Apr;10(3):354-369. doi: 10.1080/19420862.2018.1436921. Epub 2018 Mar 6.

Performance comparison of three trypsin columns used in liquid chromatography.

J Chromatogr A. 2017 Mar 24;1490:126-132. doi: 10.1016/j.chroma.2017.02.024. Epub 2017 Feb 14.

Quantification of Flavin-containing Monooxygenases 1, 3, and 5 in Human Liver Microsomes by UPLC-MRM-Based Targeted Quantitative Proteomics and Its Application to the Study of Ontogeny.

Drug Metab Dispos. 2016 Jul;44(7):975-83. doi: 10.1124/dmd.115.067538. Epub 2016 Feb 2.

Extending the coverage of spectral libraries: a neighbor-based approach to predicting intensities of peptide fragmentation spectra.

Proteomics. 2013 Mar;13(5):756-65. doi: 10.1002/pmic.201100670. Epub 2013 Feb 4.

Prediction of missed proteolytic cleavages for the selection of surrogate peptides for quantitative proteomics.

OMICS. 2012 Sep;16(9):449-56. doi: 10.1089/omi.2011.0156. Epub 2012 Jul 17.

18O-labeled proteome reference as global internal standards for targeted quantification by selected reaction monitoring-mass spectrometry.

Mol Cell Proteomics. 2011 Dec;10(12):M110.007302. doi: 10.1074/mcp.M110.007302. Epub 2011 Oct 11.
