An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters.

Author information

Qingbo Li, Bryan A. P. Roxas

Affiliation

Center for Pharmaceutical Biotechnology, College of Pharmacy, University of Illinois at Chicago, Chicago, IL 60607, USA.

Publication information

BMC Bioinformatics. 2009 Feb 2;10:43. doi: 10.1186/1471-2105-10-43.

DOI:10.1186/1471-2105-10-43

PMID:19187558

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2645366/

Abstract

BACKGROUND

Many studies have provided algorithms or methods for assessing statistical significance in quantitative proteomics when multiple replicates of a protein sample and of the LC/MS analysis are available. However, confidence is still lacking when datasets without protein sample replicates are used for biological interpretation. Although fold-change is a conventional threshold that can be used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of the reliability of identifying differentially expressed proteins. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. The FDR is used to gauge the statistical significance of the differentially expressed proteins.
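For orientation, the FDR referred to above is conventionally defined as the expected proportion of false positives among all proteins called differentially expressed. The abstract does not state which estimator the authors use, so the expression below is the standard textbook definition, with FP and TP (false and true positive counts) introduced here only as notation:

```latex
\mathrm{FDR} \;=\; \mathbb{E}\!\left[\frac{\mathrm{FP}}{\max(\mathrm{FP} + \mathrm{TP},\, 1)}\right]
```

The maximum in the denominator simply sets the ratio to zero when no protein is called.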

RESULTS

We experimented with several parameters to control the FDR, including a fold-change threshold, a statistical test, and a minimum number of permuted significant pairings. Although none of these parameters alone gives satisfactory control of the FDR, we find that combining them provides a very effective means of controlling the FDR without compromising sensitivity. The results suggest that a significance analysis can be performed without protein sample replicates; only duplicate LC/MS injections per sample are needed. We illustrate that differentially expressed proteins can be detected with an FDR between 0 and 15% at a positive rate of 4-16%. The method is evaluated for sensitivity and specificity by ROC analysis, and is further validated with a [15N]-labeled internal-standard protein sample and additional unlabeled protein sample replicates.
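The idea of combining the three filters can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the statistical test, the thresholds, or how the pairings are formed. The sketch assumes that each protein has one list of peptide log2 ratios per injection pairing, that a one-sample t statistic against zero stands in for the statistical test, and that the FDR is estimated as the number of calls in a null comparison divided by the number of calls in the real comparison. All function names, parameters, and default values are hypothetical.

```python
import math
from statistics import mean, stdev


def t_statistic(values):
    """One-sample t statistic of `values` against a mean of zero."""
    s = stdev(values)
    if s == 0:
        return math.inf if mean(values) != 0 else 0.0
    return mean(values) / (s / math.sqrt(len(values)))


def pairing_is_significant(log2_ratios, min_abs_log2_fc=1.0, t_critical=2.776):
    """Apply the fold-change filter and the t-test filter to one pairing.

    Assumption: t_critical is supplied by the caller for the relevant
    degrees of freedom (2.776 is the two-sided 0.05 value for 4 df).
    """
    if len(log2_ratios) < 2:
        return False  # a t statistic needs at least two peptide ratios
    passes_fold_change = abs(mean(log2_ratios)) >= min_abs_log2_fc
    passes_test = abs(t_statistic(log2_ratios)) >= t_critical
    return passes_fold_change and passes_test


def protein_passes(pairings, min_significant_pairings=3, **filter_args):
    """Call a protein only if enough pairings pass both filters."""
    n_significant = sum(
        pairing_is_significant(ratios, **filter_args) for ratios in pairings
    )
    return n_significant >= min_significant_pairings


def estimate_fdr(null_calls, test_calls):
    """Estimate the FDR as calls in a null comparison over calls in the real one."""
    positives = sum(test_calls)
    if positives == 0:
        return 0.0
    return min(1.0, sum(null_calls) / positives)
```

With duplicate injections of samples A and B, one plausible reading is that the four cross-sample pairings (A1/B1, A1/B2, A2/B1, A2/B2) supply the `pairings` argument, while a within-sample comparison such as A1/A2 supplies the null calls. That mapping is also an assumption rather than something stated in the abstract.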

CONCLUSION

We demonstrate that statistical significance can be inferred without protein sample replicates in label-free quantitative proteomics. The approach described in this study should be useful in many exploratory experiments where sample amount or instrument time is limited. Naturally, the method is also suitable for proteomics experiments where multiple sample replicates are available. It is simple, and it complements more sophisticated algorithms that are not designed to handle a small number of sample replicates.
