A Protein Standard That Emulates Homology for the Characterization of Protein Inference Algorithms.

Author information

Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH - Royal Institute of Technology, Box 1031, 17121 Solna, Sweden.

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.

Publication information

J Proteome Res. 2018 May 4;17(5):1879-1886. doi: 10.1021/acs.jproteome.7b00899. Epub 2018 Apr 16.

Abstract

A natural way to benchmark the performance of an analytical experimental setup is to use samples of known composition and see to what degree one can correctly infer the content of such a sample from the data. For shotgun proteomics, one of the inherent problems of interpreting data is that the measured analytes are peptides and not the actual proteins themselves. As some proteins share proteolytic peptides, there might be more than one possible causative set of proteins resulting in a given set of peptides, and there is a need for mechanisms that infer proteins from lists of detected peptides. A weakness of commercially available samples of known content is that they consist of proteins that are deliberately selected for producing tryptic peptides that are unique to a single protein. Unfortunately, such samples do not expose any complications in protein inference. Hence, for a realistic benchmark of protein inference procedures, there is a need for samples of known content where the present proteins share peptides with known absent proteins. Here, we present such a standard, based on human protein fragments expressed in E. coli. To illustrate the application of this standard, we benchmark a set of different protein inference procedures on the data. We observe that inference procedures excluding shared peptides provide more accurate estimates of errors compared to methods that include information from shared peptides, while still giving a reasonable performance in terms of the number of identified proteins. We also demonstrate that using a sample of known protein content without proteins with shared tryptic peptides can give a false sense of accuracy for many protein inference methods.
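
The shared-peptide problem described in the abstract can be illustrated with a toy sketch. Everything below is hypothetical: the protein names, the peptide sequences, and the two deliberately simple inference rules are assumptions made for illustration, and neither rule is one of the procedures benchmarked in the paper. The sketch only shows why a sample whose present proteins share peptides with known absent proteins can expose errors that a sample of unique-peptide proteins would hide.

```python
from collections import defaultdict

# Hypothetical toy database: protein -> set of tryptic peptides.
# "present" proteins are in the sample; "absent" homologs are not,
# but each absent homolog shares one peptide with a present protein.
DATABASE = {
    "A_present": {"PEPTIDEK", "SHAREDR", "UNIQUEAK"},
    "A_absent_homolog": {"SHAREDR", "OTHERK"},
    "B_present": {"SOLOBK", "COMMONR"},
    "B_absent_homolog": {"COMMONR"},
}
TRULY_PRESENT = {"A_present", "B_present"}


def peptide_to_proteins(database):
    """Invert the database: peptide -> set of proteins containing it."""
    index = defaultdict(set)
    for protein, peptides in database.items():
        for peptide in peptides:
            index[peptide].add(protein)
    return index


def infer_inclusive(detected, database):
    """Report every protein matching at least one detected peptide."""
    return {prot for prot, peps in database.items() if peps & detected}


def infer_unique_only(detected, database):
    """Report only proteins supported by a detected peptide unique to them."""
    index = peptide_to_proteins(database)
    inferred = set()
    for peptide in detected:
        proteins = index.get(peptide, set())
        if len(proteins) == 1:  # peptide maps to exactly one protein
            inferred |= proteins
    return inferred


# Hypothetical list of detected peptides from the sample.
detected = {"PEPTIDEK", "SHAREDR", "COMMONR", "SOLOBK"}

inclusive = infer_inclusive(detected, DATABASE)
unique_only = infer_unique_only(detected, DATABASE)

# False positives are inferred proteins that are known to be absent.
print(sorted(inclusive - TRULY_PRESENT))    # ['A_absent_homolog', 'B_absent_homolog']
print(sorted(unique_only - TRULY_PRESENT))  # []
```

In this toy case the inclusive rule reports both absent homologs, while the unique-peptide rule reports only the two present proteins. Real inference procedures weigh peptide evidence probabilistically, so this sketch should be read as an intuition aid rather than a model of any specific tool.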
