A statistical model for identifying proteins by tandem mass spectrometry.

Authors

Nesvizhskii Alexey I, Keller Andrew, Kolker Eugene, Aebersold Ruedi

Affiliation

Institute for Systems Biology, 1441 North 34th Street, Seattle, Washington 98103, USA.

Publication information

Anal Chem. 2003 Sep 1;75(17):4646-58. doi: 10.1021/ac0341261.

Abstract

A statistical model is presented for computing probabilities that proteins are present in a sample on the basis of peptides assigned to tandem mass (MS/MS) spectra acquired from a proteolytic digest of the sample. Peptides that correspond to more than a single protein in the sequence database are apportioned among all corresponding proteins, and a minimal protein list sufficient to account for the observed peptide assignments is derived using the expectation-maximization algorithm. Using peptide assignments to spectra generated from a sample of 18 purified proteins, as well as complex H. influenzae and Halobacterium samples, the model is shown to produce probabilities that are accurate and have high power to discriminate correct from incorrect protein identifications. This method allows filtering of large-scale proteomics data sets with predictable sensitivity and false positive identification error rates. Fast, consistent, and transparent, it provides a standard for publishing large-scale protein identification data sets in the literature and for comparing the results obtained from different experiments.
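
The apportioning idea in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: a protein is scored as 1 minus the product of (1 - weight × peptide probability) over its peptides, and each shared peptide's weight is repeatedly re-split in proportion to the current protein scores. The function name `protein_probabilities`, the fixed iteration count, and the toy data are hypothetical, and this is not presented as the authors' full model.

```python
from collections import defaultdict


def protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=50):
    """Illustrative sketch only (assumed update rule, not the published model).

    peptide_probs: dict peptide -> probability its spectrum assignment is correct
    peptide_to_proteins: dict peptide -> list of proteins containing that peptide
    """
    proteins = sorted({p for ps in peptide_to_proteins.values() for p in ps})
    # Start by splitting each peptide evenly among its candidate proteins.
    weights = {
        (pep, prot): 1.0 / len(prots)
        for pep, prots in peptide_to_proteins.items()
        for prot in prots
    }
    probs = {}
    for _ in range(n_iter):
        # A protein is absent only if every weighted peptide assignment is wrong.
        miss = defaultdict(lambda: 1.0)
        for (pep, prot), w in weights.items():
            miss[prot] *= 1.0 - w * peptide_probs[pep]
        probs = {prot: 1.0 - miss[prot] for prot in proteins}
        # Re-apportion each peptide in proportion to the current protein scores.
        for pep, prots in peptide_to_proteins.items():
            total = sum(probs[prot] for prot in prots)
            for prot in prots:
                if total > 0:
                    weights[(pep, prot)] = probs[prot] / total
                else:
                    weights[(pep, prot)] = 1.0 / len(prots)
    return probs


# Toy data (made up): peptide "b" is shared by P1 and P2; P2 has no unique peptide.
example = protein_probabilities(
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": ["P1"], "b": ["P1", "P2"], "c": ["P3"]},
)
```

In this toy case the shared peptide's weight drifts toward P1, which also has a unique peptide, so the score of P2 shrinks toward zero. That is the sense in which such a scheme tends toward a minimal protein list that still accounts for the observed peptides.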
