
A new strategy to filter out false positive identifications of peptides in SEQUEST database search results.

Authors

Zhang Jiyang, Li Jianqi, Xie Hongwei, Zhu Yunping, He Fuchu

Affiliation

College of Mechanical and Electronic Engineering and Automatization, National University of Defense Technology, Changsha, China.

Publication

Proteomics. 2007 Nov;7(22):4036-44. doi: 10.1002/pmic.200600929.

DOI: 10.1002/pmic.200600929
PMID: 17952874

Abstract

Based on the randomized database method and a linear discriminant function (LDF) model, a new strategy to filter out false positive matches in SEQUEST database search results is proposed. Given an experimental MS/MS dataset and a protein sequence database, a randomized database is constructed and merged with the original database. Then, all MS/MS spectra are searched against the combined database. For each expected false positive rate (FPR), LDFs are constructed for different charge states and used to filter out the false positive matches from the normal database. To investigate the error of FPR estimation, the new strategy was applied to a reference dataset; the estimated FPR was very close to the actual FPR. When applied to a human K562 cell line dataset, a complicated dataset from a real sample, the strategy confirmed more matches than the traditional cutoff-based methods at the same estimated FPR. Also, though most of the results confirmed by the LDF model were consistent with those of PeptideProphet, the LDF model could still provide complementary information. These results indicate that the new method can reliably control the FPR of peptide identifications and is more sensitive than traditional cutoff-based methods.
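
The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the LDF features, the weights, or the exact FPR formula, so everything below is an assumption. The sketch assumes each match already carries a flag saying whether it hit the randomized (decoy) part of the combined database, estimates the FPR at a score threshold as decoy hits divided by normal-database hits, and picks the lowest threshold that meets an expected FPR. The names `ldf_score`, `estimated_fpr`, and `pick_threshold` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# One match: (discriminant score, True if the hit came from the randomized database).
Match = Tuple[float, bool]


def ldf_score(features: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Linear discriminant score: weighted sum of match features plus a bias.

    The features and weights are placeholders; the abstract does not list them.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


def estimated_fpr(matches: List[Match], threshold: float) -> float:
    """Estimate the FPR among normal-database matches scoring >= threshold.

    Assumption: the number of passing decoy hits approximates the number of
    passing false hits in the normal database.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        # No normal-database match passes: only acceptable if no decoy passes either.
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def pick_threshold(matches: List[Match], expected_fpr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FPR is <= expected_fpr."""
    for candidate in sorted({score for score, _ in matches}):
        if estimated_fpr(matches, candidate) <= expected_fpr:
            return candidate
    return None


# Toy data for a single charge state (values are made up for illustration).
matches = [(5.0, False), (4.0, False), (3.0, True), (2.5, False), (2.0, False), (1.0, True)]
threshold = pick_threshold(matches, expected_fpr=0.25)
accepted = [m for m in matches if m[0] >= threshold and not m[1]]
```

To mirror the per-charge-state step described in the abstract, the same procedure would be run separately on the matches of each precursor charge, each with its own weights and threshold.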

Similar articles

1
A new strategy to filter out false positive identifications of peptides in SEQUEST database search results.
Proteomics. 2007 Nov;7(22):4036-44. doi: 10.1002/pmic.200600929.
2
Oscore: a combined score to reduce false negative rates for peptide identification in tandem mass spectrometry analysis.
J Mass Spectrom. 2009 Jan;44(1):25-31. doi: 10.1002/jms.1466.
3
Probability-based evaluation of peptide and protein identifications from tandem mass spectrometry and SEQUEST analysis: the human proteome.
J Proteome Res. 2005 Jan-Feb;4(1):53-62. doi: 10.1021/pr0498638.
4
Support vector machines for improved peptide identification from tandem mass spectrometry database search.
Methods Mol Biol. 2009;492:453-60. doi: 10.1007/978-1-59745-493-3_28.
5
Improving peptide identification using an empirical peptide retention time database.
Rapid Commun Mass Spectrom. 2009 Jan;23(1):109-18. doi: 10.1002/rcm.3851.
6
Maximizing the sensitivity and reliability of peptide identification in large-scale proteomic experiments by harnessing multiple search engines.
Proteomics. 2010 Mar;10(6):1172-89. doi: 10.1002/pmic.200900074.
7
Systematic determination of ion score cutoffs based on calculated false positive rates: application for identifying ubiquitinated proteins by tandem mass spectrometry.
J Mass Spectrom. 2008 Mar;43(3):296-304. doi: 10.1002/jms.1297.
8
A hybrid method for peptide identification using integer linear optimization, local database search, and quadrupole time-of-flight or OrbiTrap tandem mass spectrometry.
J Proteome Res. 2008 Apr;7(4):1584-93. doi: 10.1021/pr700577z. Epub 2008 Mar 7.
9
PepHMM: a hidden Markov model based scoring function for mass spectrometry database search.
Anal Chem. 2006 Jan 15;78(2):432-7. doi: 10.1021/ac051319a.
10
Lookup peaks: a hybrid of de novo sequencing and database search for protein identification by tandem mass spectrometry.
Anal Chem. 2007 Feb 15;79(4):1393-400. doi: 10.1021/ac0617013. Epub 2007 Jan 23.

Cited by

1
Full-Featured, Real-Time Database Searching Platform Enables Fast and Accurate Multiplexed Quantitative Proteomics.
J Proteome Res. 2020 May 1;19(5):2026-2034. doi: 10.1021/acs.jproteome.9b00860. Epub 2020 Apr 6.
2
Performance comparisons of nano-LC systems, electrospray sources and LC-MS-MS platforms.
J Chromatogr Sci. 2014 Feb;52(2):120-7. doi: 10.1093/chromsci/bms255. Epub 2013 Jan 17.
3
Learning from decoys to improve the sensitivity and specificity of proteomics database search results.
PLoS One. 2012;7(11):e50651. doi: 10.1371/journal.pone.0050651. Epub 2012 Nov 26.
4
IDPicker 2.0: Improved protein assembly with high discrimination peptide identification filtering.
J Proteome Res. 2009 Aug;8(8):3872-81. doi: 10.1021/pr900360j.
5
Bayesian nonparametric model for the validation of peptide identification in shotgun proteomics.
Mol Cell Proteomics. 2009 Mar;8(3):547-57. doi: 10.1074/mcp.M700558-MCP200. Epub 2008 Nov 12.
6
Bioinformatics in China: a personal perspective.
PLoS Comput Biol. 2008 Apr 25;4(4):e1000020. doi: 10.1371/journal.pcbi.1000020.
7
A nonparametric model for quality control of database search results in shotgun proteomics.
BMC Bioinformatics. 2008 Jan 21;9:29. doi: 10.1186/1471-2105-9-29.