Detecting differential protein expression in large-scale population proteomics.

Authors

Ryu So Young, Qian Wei-Jun, Camp David G, Smith Richard D, Tompkins Ronald G, Davis Ronald W, Xiao Wenzhong

Affiliations

Stanford Genome Technology Center, Stanford University, Stanford, CA 94305, USA, Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99352, USA and Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA.

Publication information

Bioinformatics. 2014 Oct;30(19):2741-6. doi: 10.1093/bioinformatics/btu341. Epub 2014 Jun 12.

DOI:10.1093/bioinformatics/btu341

PMID:24928210

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4173009/

Abstract

MOTIVATION

Mass spectrometry (MS)-based high-throughput quantitative proteomics shows great potential in large-scale clinical biomarker studies, identifying and quantifying thousands of proteins in biological samples. However, there are unique challenges in analyzing the quantitative proteomics data. One issue is that the quantification of a given peptide is often missing in a subset of the experiments, especially for less abundant peptides. Another issue is that different MS experiments of the same study have significantly varying numbers of peptides quantified, which can result in more missing peptide abundances in an experiment that has a smaller total number of quantified peptides. To detect as many biomarker proteins as possible, it is necessary to develop bioinformatics methods that appropriately handle these challenges.
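The two missingness mechanisms described above can be illustrated with a small simulation. This is a minimal sketch and not the SALPS model: the logistic link, the abundance distribution, and the per-experiment `depths` offsets are all hypothetical choices made only to show how low-abundance peptides and low-coverage experiments both end up with more missing values.

```python
import math
import random


def simulate_missingness(n_peptides=200, depths=(-1.5, 0.0, 1.5), seed=0):
    """Simulate log-intensities with two missingness mechanisms.

    All numeric settings are illustrative assumptions, not values from the paper.
    Returns (abundance, data), where data[i][j] is a float or None (missing).
    """
    rng = random.Random(seed)
    # Hypothetical mean log-abundance for each peptide.
    abundance = [rng.gauss(20.0, 2.0) for _ in range(n_peptides)]
    data = []
    for mean in abundance:
        row = []
        for depth in depths:
            value = mean + rng.gauss(0.0, 0.5)
            # Assumed logistic link: the chance of being quantified rises with
            # peptide abundance (mechanism 1) and with the experiment-level
            # depth offset (mechanism 2).
            p_observed = 1.0 / (1.0 + math.exp(-(value - 20.0 + depth)))
            row.append(value if rng.random() < p_observed else None)
        data.append(row)
    return abundance, data


def missing_fraction(values):
    """Fraction of entries that are missing (None)."""
    values = list(values)
    return sum(v is None for v in values) / len(values)


def summarize(abundance, data):
    """Missing fraction for the low- and high-abundance halves, and per experiment."""
    order = sorted(range(len(abundance)), key=lambda i: abundance[i])
    half = len(order) // 2
    low = [v for i in order[:half] for v in data[i]]
    high = [v for i in order[half:] for v in data[i]]
    per_experiment = [
        missing_fraction(row[j] for row in data) for j in range(len(data[0]))
    ]
    return missing_fraction(low), missing_fraction(high), per_experiment


if __name__ == "__main__":
    abundance, data = simulate_missingness()
    low, high, per_experiment = summarize(abundance, data)
    print("missing fraction, low-abundance half: ", round(low, 3))
    print("missing fraction, high-abundance half:", round(high, 3))
    print("missing fraction per experiment:", [round(f, 3) for f in per_experiment])
```

Under these assumptions, the low-abundance half of the peptides should show the larger missing fraction, and experiments with a smaller depth offset should lose more peptides overall. A method that ignores either effect would treat these values as missing at random.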

RESULTS

We propose a Significance Analysis for Large-scale Proteomics Studies (SALPS) that handles missing peptide intensity values caused by the two mechanisms mentioned above. Our model performs robustly on both simulated data and proteomics data from a large clinical study. Because variation in patients' sample quality and in instrument performance is unavoidable in clinical studies performed over the course of several years, we believe that our approach will be useful for analyzing large-scale clinical proteomics data.

AVAILABILITY AND IMPLEMENTATION

R code for SALPS is available at http://www.stanford.edu/%7eclairesr/software.html.
