Mspire-Simulator: LC-MS shotgun proteomic simulator for creating realistic gold standard data.

Affiliation

Department of Biochemistry, Brigham Young University, 701 East University Parkway, BNSN C100, Provo, Utah 84602, United States.

Publication information

J Proteome Res. 2013 Dec 6;12(12):5742-9. doi: 10.1021/pr400727e. Epub 2013 Oct 3.

DOI:10.1021/pr400727e

PMID:24090032

Abstract

The most important step in any quantitative proteomic pipeline is feature detection (aka peak picking). However, generating quality hand-annotated data sets to validate the algorithms, especially for lower abundance peaks, is nearly impossible. An alternative for creating gold standard data is to simulate it with features closely mimicking real data. We present Mspire-Simulator, a free, open-source shotgun proteomic simulator that goes beyond previous simulation attempts by generating LC-MS features with realistic m/z and intensity variance along with other noise components. It also includes machine-learned models for retention time and peak intensity prediction and a genetic algorithm to custom fit model parameters for experimental data sets. We show that these methods are applicable to data from three different mass spectrometers, including two fundamentally different types, and show visually and analytically that simulated peaks are nearly indistinguishable from actual data. Researchers can use simulated data to rigorously test quantitation software, and proteomic researchers may benefit from overlaying simulated data on actual data sets.

Similar articles

LC-MSsim--a simulation software for liquid chromatography mass spectrometry data.

BMC Bioinformatics. 2008 Oct 8;9:423. doi: 10.1186/1471-2105-9-423.

MSSimulator: Simulation of mass spectrometry data.

J Proteome Res. 2011 Jul 1;10(7):2922-9. doi: 10.1021/pr200155f. Epub 2011 Apr 28.

Quality control metrics for LC-MS feature detection tools demonstrated on Saccharomyces cerevisiae proteomic profiles.

J Proteome Res. 2006 Jul;5(7):1527-34. doi: 10.1021/pr050436j.

MassUntangler: a novel alignment tool for label-free liquid chromatography-mass spectrometry proteomic data.

J Chromatogr A. 2011 Dec 9;1218(49):8859-68. doi: 10.1016/j.chroma.2011.06.062. Epub 2011 Jun 22.

i-RUBY: a novel software for quantitative analysis of highly accurate shotgun-proteomics liquid chromatography/tandem mass spectrometry data obtained without stable-isotope labeling of proteins.

Rapid Commun Mass Spectrom. 2011 Apr 15;25(7):960-8. doi: 10.1002/rcm.4943. Epub 2011 Mar 14.

Global quantitative proteomic profiling through 18O-labeling in combination with MS/MS spectra analysis.

J Proteome Res. 2009 Jul;8(7):3653-65. doi: 10.1021/pr8009098.

An iterative strategy for precursor ion selection for LC-MS/MS based shotgun proteomics.

J Proteome Res. 2009 Jul;8(7):3239-51. doi: 10.1021/pr800835x.

Generic workflow for quality assessment of quantitative label-free LC-MS analysis.

Proteomics. 2011 Mar;11(6):1114-24. doi: 10.1002/pmic.201000493. Epub 2011 Feb 7.

Open-source platform for the analysis of liquid chromatography-mass spectrometry (LC-MS) data.

Methods Mol Biol. 2008;428:369-82. doi: 10.1007/978-1-59745-117-8_19.

Cited by

Simulation of mass spectrometry-based proteomics data with Synthedia.

Bioinform Adv. 2022 Dec 19;3(1):vbac096. doi: 10.1093/bioadv/vbac096. eCollection 2023.

Normalizing and Correcting Variable and Complex LC-MS Metabolomic Data with the R Package pseudoDrift.

Metabolites. 2022 May 12;12(5):435. doi: 10.3390/metabo12050435.

SMITER-A Python Library for the Simulation of LC-MS/MS Experiments.

Genes (Basel). 2021 Mar 11;12(3):396. doi: 10.3390/genes12030396.

In Silico Optimization of Mass Spectrometry Fragmentation Strategies in Metabolomics.

Metabolites. 2019 Oct 9;9(10):219. doi: 10.3390/metabo9100219.

Accelerating Lipidomic Method Development through Simulation.

Anal Chem. 2019 Aug 6;91(15):9698-9706. doi: 10.1021/acs.analchem.9b01234. Epub 2019 Jul 25.

MSAcquisitionSimulator: data-dependent acquisition simulator for LC-MS shotgun proteomics.

Bioinformatics. 2016 Apr 15;32(8):1269-71. doi: 10.1093/bioinformatics/btv745. Epub 2015 Dec 17.

Testing and Validation of Computational Methods for Mass Spectrometry.

J Proteome Res. 2016 Mar 4;15(3):809-14. doi: 10.1021/acs.jproteome.5b00852. Epub 2015 Nov 17.

Proteomics, lipidomics, metabolomics: a mass spectrometry tutorial from a computer scientist's point of view.

BMC Bioinformatics. 2014;15 Suppl 7(Suppl 7):S9. doi: 10.1186/1471-2105-15-S7-S9. Epub 2014 May 28.

Contemporary network proteomics and its requirements.

Biology (Basel). 2013 Dec 20;3(1):22-38. doi: 10.3390/biology3010022.
