Testing and Validation of Computational Methods for Mass Spectrometry.

Authors

Gatto Laurent, Hansen Kasper D, Hoopmann Michael R, Hermjakob Henning, Kohlbacher Oliver, Beyer Andreas

Affiliations

Computational Proteomics Unit and Cambridge Centre for Proteomics, University of Cambridge, Cambridge CB2 1QR, United Kingdom.

Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, United States.

Publication information

J Proteome Res. 2016 Mar 4;15(3):809-14. doi: 10.1021/acs.jproteome.5b00852. Epub 2015 Nov 17.

DOI:10.1021/acs.jproteome.5b00852

PMID:26549429

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4804830/

Abstract

High-throughput methods based on mass spectrometry (proteomics, metabolomics, lipidomics, etc.) produce a wealth of data that cannot be analyzed without computational methods. The impact of the choice of method on the overall result of a biological study is often underappreciated, but different methods can result in very different biological findings. It is thus essential to evaluate and compare the correctness and relative performance of computational methods. The volume of the data as well as the complexity of the algorithms render unbiased comparisons challenging. This paper discusses some problems and challenges in testing and validation of computational methods. We discuss the different types of data (simulated and experimental validation data) as well as different metrics to compare methods. We also introduce a new public repository for mass spectrometric reference data sets (http://compms.org/RefData) that contains a collection of publicly available data sets for performance evaluation for a wide range of different methods.
