Systematic evaluation of normalization approaches in tandem mass tag and label-free protein quantification data using PRONE.

Author information

Arend Lis, Adamowicz Klaudia, Schmidt Johannes R, Burankova Yuliya, Zolotareva Olga, Tsoy Olga, Pauling Josch K, Kalkhof Stefan, Baumbach Jan, List Markus, Laske Tanja

Affiliations

Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany.

Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany.

Publication information

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf201.

DOI:10.1093/bib/bbaf201

PMID:40336172

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12058466/

Abstract

Despite significant progress in the accuracy and reliability of mass spectrometry technology, as well as the development of strategies based on isotopic labeling or internal standards in recent decades, systematic biases originating from non-biological factors remain a significant challenge in data analysis. In addition, the wide range of available normalization methods makes choosing a suitable one challenging. We systematically evaluated 17 normalization and 2 batch effect correction methods, originally developed for preprocessing DNA microarray data but widely applied in proteomics, on 6 publicly available spike-in and 3 label-free and tandem mass tag datasets. Contrary to state-of-the-art normalization practice, we found that a reduction in intragroup variation is not directly related to the effectiveness of the normalization methods. Furthermore, our results demonstrated that the methods RobNorm and Normics, specifically developed for proteomics data, along with LoessF, performed consistently well across the spike-in datasets, while EigenMS exhibited a high false-positive rate. Finally, based on experimental data, we show that normalization substantially impacts downstream analyses and that the impact is highly dataset-specific, emphasizing the importance of use-case-specific evaluations for novel proteomics datasets. For this, we developed the PROteomics Normalization Evaluator (PRONE), a unifying R package enabling comparative evaluation of normalization methods, including their impact on downstream analyses, while offering considerable flexibility, acknowledging the lack of universally accepted standards. PRONE is available on Bioconductor with a web application accessible at https://exbio.wzw.tum.de/prone/.
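
As a rough illustration of what "intragroup variation" refers to here, the following Python sketch applies a simple median normalization to log2 intensities and measures the median absolute deviation (MAD) across the replicates of one group before and after. The function names, the toy data, and the choice of MAD as the variation measure are assumptions made for this example; it does not use PRONE, which is an R package, and it is not taken from the paper.

```python
from statistics import mean, median


def median_normalize(samples):
    """Shift each sample so that all samples share the same median.

    `samples` is a list of samples; each sample is a list of log2
    intensities, one value per protein (no missing values assumed).
    """
    sample_medians = [median(s) for s in samples]
    target = mean(sample_medians)  # common median after normalization
    return [[x - m + target for x in s] for s, m in zip(samples, sample_medians)]


def intragroup_mad(samples):
    """Mean over proteins of the MAD across the replicates of one group."""
    per_protein = []
    for values in zip(*samples):  # iterate protein by protein
        center = median(values)
        per_protein.append(median(abs(v - center) for v in values))
    return mean(per_protein)


# Hypothetical toy data: three replicates of one group, three proteins each.
# The replicates differ only by a constant per-sample offset.
raw = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [9.5, 11.5, 13.5],
]
normalized = median_normalize(raw)

print(intragroup_mad(raw))         # 0.5
print(intragroup_mad(normalized))  # 0.0
```

In this toy case the intragroup MAD drops from 0.5 to 0 because the only difference between replicates is a per-sample shift. As the abstract notes, such a reduction is not directly related to how effective a normalization method is, so a measure like this is probably best read alongside other evidence, for example the known ground truth in spike-in datasets.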

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd67/12058466/cf41e2a8c80b/bbaf201f1.jpg

Similar articles

A systematic evaluation of normalization methods in quantitative label-free proteomics.

Brief Bioinform. 2018 Jan 1;19(1):1-11. doi: 10.1093/bib/bbw095.

Normics: Proteomic Normalization by Variance and Data-Inherent Correlation Structure.

Mol Cell Proteomics. 2022 Sep;21(9):100269. doi: 10.1016/j.mcpro.2022.100269. Epub 2022 Jul 16.

DEqMS: A Method for Accurate Variance Estimation in Differential Protein Expression Analysis.

Mol Cell Proteomics. 2020 Jun;19(6):1047-1057. doi: 10.1074/mcp.TIR119.001646. Epub 2020 Mar 23.

Normalization of peak intensities in bottom-up MS-based proteomics using singular value decomposition.

Bioinformatics. 2009 Oct 1;25(19):2573-80. doi: 10.1093/bioinformatics/btp426. Epub 2009 Jul 14.

NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis.

J Proteome Res. 2019 Feb 1;18(2):732-740. doi: 10.1021/acs.jproteome.8b00523. Epub 2018 Oct 15.

Clinical biomarker discovery by SWATH-MS based label-free quantitative proteomics: impact of criteria for identification of differentiators and data normalization method.

J Transl Med. 2019 May 31;17(1):184. doi: 10.1186/s12967-019-1937-9.

The application of new software tools to quantitative protein profiling via isotope-coded affinity tag (ICAT) and tandem mass spectrometry: II. Evaluation of tandem mass spectrometry methodologies for large-scale protein analysis, and the application of statistical tools for data analysis and interpretation.

Mol Cell Proteomics. 2003 Jul;2(7):428-42. doi: 10.1074/mcp.M300041-MCP200. Epub 2003 Jun 25.

RobNorm: model-based robust normalization method for labeled quantitative mass spectrometry proteomics data.

Bioinformatics. 2021 May 5;37(6):815-821. doi: 10.1093/bioinformatics/btaa904.

Using PSEA-Quant for Protein Set Enrichment Analysis of Quantitative Mass Spectrometry-Based Proteomics.

Curr Protoc Bioinformatics. 2016 Mar 24;53:13.28.1-13.28.16. doi: 10.1002/0471250953.bi1328s53.

Cited by

Multiplexed Quantification of First-Trimester Serum Biomarkers in Healthy Pregnancy.

Int J Mol Sci. 2025 Aug 18;26(16):7970. doi: 10.3390/ijms26167970.

Privacy-preserving multicenter differential protein abundance analysis with FedProt.

Nat Comput Sci. 2025 Aug;5(8):675-688. doi: 10.1038/s43588-025-00832-7. Epub 2025 Jul 11.
