Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics.

Authors

Choi Hyungwon, Nesvizhskii Alexey I

Affiliation

Department of Pathology and Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA.

Publication information

J Proteome Res. 2008 Jan;7(1):254-65. doi: 10.1021/pr070542g. Epub 2007 Dec 27.

DOI: 10.1021/pr070542g
PMID: 18159924

Abstract

Development of robust statistical methods for validation of peptide assignments to tandem mass (MS/MS) spectra obtained using database searching remains an important problem. PeptideProphet is one of the commonly used computational tools available for that purpose. An alternative simple approach for validation of peptide assignments is based on addition of decoy (reversed, randomized, or shuffled) sequences to the searched protein sequence database. The probabilistic modeling approach of PeptideProphet and the decoy strategy can be combined within a single semisupervised framework, leading to improved robustness and higher accuracy of computed probabilities even in the case of most challenging data sets. We present a semisupervised expectation-maximization (EM) algorithm for constructing a Bayes classifier for peptide identification using the probability mixture model, extending PeptideProphet to incorporate decoy peptide matches. Using several data sets of varying complexity, from control protein mixtures to a human plasma sample, and using three commonly used database search programs, SEQUEST, MASCOT, and TANDEM/k-score, we illustrate that more accurate mixture estimation leads to an improved control of the false discovery rate in the classification of peptide assignments.
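The semisupervised EM idea described in the abstract can be illustrated with a minimal sketch. It is not the PeptideProphet implementation. It assumes a one-dimensional score per peptide-spectrum match, models both the incorrect and the correct component as Gaussians (a simplification chosen here for brevity; the abstract does not specify the component distributions), and treats every decoy match as a labeled negative whose posterior is pinned to zero. The names `semisupervised_em`, `estimated_fdr`, and `simulate` are hypothetical, and the data are synthetic.

```python
import math
import random


def normal_pdf(x, mu, sd):
    """Density of N(mu, sd^2) at x."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def weighted_mean_sd(xs, ws):
    """Weighted mean and standard deviation (sd floored to avoid collapse)."""
    total = sum(ws)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    mu = sum(w * x for x, w in zip(xs, ws)) / total
    var = sum(w * (x - mu) ** 2 for x, w in zip(xs, ws)) / total
    return mu, max(math.sqrt(var), 1e-6)


def semisupervised_em(scores, is_decoy, n_iter=500, tol=1e-9):
    """Fit incorrect/correct score components with decoys as labeled negatives.

    Returns (pi, (mu0, sd0), (mu1, sd1), post): pi is the estimated fraction
    of correct matches among targets, and post[i] is the posterior probability
    that match i is correct (always 0.0 for decoys).
    """
    decoys = [s for s, d in zip(scores, is_decoy) if d]
    targets = sorted(s for s, d in zip(scores, is_decoy) if not d)
    # Initialise the negative component from decoys and the positive
    # component from the top-scoring quarter of targets (an assumption).
    mu0, sd0 = weighted_mean_sd(decoys, [1.0] * len(decoys))
    top = targets[-max(len(targets) // 4, 2):]
    mu1, sd1 = weighted_mean_sd(top, [1.0] * len(top))
    pi = 0.5
    post = [0.0] * len(scores)
    for _ in range(n_iter):
        # E-step: decoys stay in the negative class; targets receive the
        # Bayes posterior under the current parameters.
        for i, (x, d) in enumerate(zip(scores, is_decoy)):
            if d:
                post[i] = 0.0
            else:
                a = pi * normal_pdf(x, mu1, sd1)
                b = (1.0 - pi) * normal_pdf(x, mu0, sd0)
                post[i] = a / (a + b) if a + b > 0.0 else 0.0
        # M-step: the mixing proportion uses targets only; the negative
        # component uses decoys plus soft-assigned targets.
        new_pi = sum(post) / len(targets)
        mu1, sd1 = weighted_mean_sd(scores, post)
        mu0, sd0 = weighted_mean_sd(scores, [1.0 - p for p in post])
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi, (mu0, sd0), (mu1, sd1), post


def estimated_fdr(post, is_decoy, threshold):
    """Model-based FDR among targets accepted at posterior >= threshold."""
    accepted = [p for p, d in zip(post, is_decoy) if not d and p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


def simulate(n_targets=2000, n_decoys=1200, frac_correct=0.4, seed=0):
    """Synthetic scores: incorrect ~ N(0, 1), correct ~ N(4, 1)."""
    rng = random.Random(seed)
    scores, is_decoy = [], []
    for _ in range(n_targets):
        correct = rng.random() < frac_correct
        scores.append(rng.gauss(4.0, 1.0) if correct else rng.gauss(0.0, 1.0))
        is_decoy.append(False)
    for _ in range(n_decoys):
        scores.append(rng.gauss(0.0, 1.0))
        is_decoy.append(True)
    return scores, is_decoy


scores, is_decoy = simulate()
pi, neg, pos, post = semisupervised_em(scores, is_decoy)
print(f"fraction correct among targets: {pi:.3f}")
print(f"estimated FDR at posterior >= 0.9: {estimated_fdr(post, is_decoy, 0.9):.4f}")
```

Pinning the decoy posteriors anchors the negative component to matches that are known to be incorrect. This is one plausible reading of why the abstract reports improved robustness on challenging data sets, where an unsupervised mixture fit may struggle to separate the two components.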

Similar articles

1
Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics.
J Proteome Res. 2008 Jan;7(1):254-65. doi: 10.1021/pr070542g. Epub 2007 Dec 27.
2
Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling.
J Proteome Res. 2008 Jan;7(1):286-92. doi: 10.1021/pr7006818. Epub 2007 Dec 14.
3
Improving sensitivity by probabilistically combining results from multiple MS/MS search methodologies.
J Proteome Res. 2008 Jan;7(1):245-53. doi: 10.1021/pr070540w.
4
Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focusing of peptides.
J Proteome Res. 2005 Nov-Dec;4(6):2273-82. doi: 10.1021/pr050193v.
5
Probability-based evaluation of peptide and protein identifications from tandem mass spectrometry and SEQUEST analysis: the human proteome.
J Proteome Res. 2005 Jan-Feb;4(1):53-62. doi: 10.1021/pr0498638.
6
Oscore: a combined score to reduce false negative rates for peptide identification in tandem mass spectrometry analysis.
J Mass Spectrom. 2009 Jan;44(1):25-31. doi: 10.1002/jms.1466.
7
Statistical models for protein validation using tandem mass spectral data and protein amino acid sequence databases.
Anal Chem. 2004 Mar 15;76(6):1664-71. doi: 10.1021/ac035112y.
8
Optimization of Search Engines and Postprocessing Approaches to Maximize Peptide and Protein Identification for High-Resolution Mass Data.
J Proteome Res. 2015 Nov 6;14(11):4662-73. doi: 10.1021/acs.jproteome.5b00536. Epub 2015 Sep 30.
9
Protein identification by tandem mass spectrometry and sequence database searching.
Methods Mol Biol. 2007;367:87-119. doi: 10.1385/1-59745-275-0:87.
10
Reverse and Random Decoy Methods for False Discovery Rate Estimation in High Mass Accuracy Peptide Spectral Library Searches.
J Proteome Res. 2018 Feb 2;17(2):846-857. doi: 10.1021/acs.jproteome.7b00614. Epub 2018 Jan 11.

Cited by

1
Comparative Analysis of Data-Driven Rescoring Platforms for Improved Peptide Identification in HeLa Digest Samples.
Proteomics. 2025 Apr;25(7):e202400225. doi: 10.1002/pmic.202400225. Epub 2025 Feb 2.
2
Network-based elucidation of colon cancer drug resistance mechanisms by phosphoproteomic time-series analysis.
Nat Commun. 2024 May 9;15(1):3909. doi: 10.1038/s41467-024-47957-3.
3
Detecting diagnostic features in MS/MS spectra of post-translationally modified peptides.
Nat Commun. 2023 Jul 12;14(1):4132. doi: 10.1038/s41467-023-39828-0.
4
The Classical Apoptotic Adaptor FADD Regulates Glycolytic Capacity in Acute Lymphoblastic Leukemia.
Int J Biol Sci. 2022 May 1;18(8):3137-3155. doi: 10.7150/ijbs.68016. eCollection 2022.
5
TIDD: tool-independent and data-dependent machine learning for peptide identification.
BMC Bioinformatics. 2022 Mar 30;23(1):109. doi: 10.1186/s12859-022-04640-y.
6
Dynamics of huntingtin protein interactions in the striatum identifies candidate modifiers of Huntington disease.
Cell Syst. 2022 Apr 20;13(4):304-320.e5. doi: 10.1016/j.cels.2022.01.005. Epub 2022 Feb 10.
7
Comparison of false-discovery rates of various decoy databases.
Proteome Sci. 2021 Sep 18;19(1):11. doi: 10.1186/s12953-021-00179-7.
8
Deep learning for peptide identification from metaproteomics datasets.
J Proteomics. 2021 Sep 15;247:104316. doi: 10.1016/j.jprot.2021.104316. Epub 2021 Jul 8.
9
IonQuant Enables Accurate and Sensitive Label-Free Quantification With FDR-Controlled Match-Between-Runs.
Mol Cell Proteomics. 2021;20:100077. doi: 10.1016/j.mcpro.2021.100077. Epub 2021 Apr 2.
10
A Critical Review of Bottom-Up Proteomics: The Good, the Bad, and the Future of this Field.
Proteomes. 2020 Jul 6;8(3):14. doi: 10.3390/proteomes8030014.