A Scalable, Web-Based Platform for Proteomics Data Processing, Result Storage and Analysis.

Authors

Schneider Markus, Zolg Daniel P, Samaras Patroklos, Ben Fredj Samia, Bold Dulguun, Guevende Agnes, Hogrebe Alexander, Berger Michelle T, Graber Michael, Sukumar Vishal, Mamisashvili Lizi, Bronsthein Igor, Eljagh Layla, Gessulat Siegfried, Seefried Florian, Schmidt Tobias, Frejno Martin

Affiliations

MSAID GmbH, Garching b. München 85748, Germany.

MSAID GmbH, Berlin 13347, Germany.

Publication information

J Proteome Res. 2025 Mar 7;24(3):1241-1249. doi: 10.1021/acs.jproteome.4c00871. Epub 2025 Feb 21.

DOI:10.1021/acs.jproteome.4c00871

PMID:39982847

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11894649/

Abstract

The exponential increase in proteomics data presents critical challenges for conventional processing workflows. These pipelines often consist of fragmented software packages, glued together using complex in-house scripts or error-prone manual workflows running on local hardware, which are costly to maintain and scale. The MSAID Platform offers a fully automated, managed proteomics data pipeline, consolidating formerly disjointed functions into unified, API-driven services that cover the entire process from raw data to biological insights. Backed by the cloud-native search algorithm CHIMERYS, as well as scalable cloud compute instances and data lakes, the platform facilitates efficient processing of large data sets, automation of processing via the command line, systematic result storage, analysis, and visualization. The data lake supports elastically growing storage and unified query capabilities, facilitating large-scale analyses and efficient reuse of previously processed data, such as aggregating longitudinally acquired studies. Users interact with the platform via a web interface, CLI client, or API, providing flexible, automated access. Readily available tools for accessing result data include browser-based interrogation and one-click visualizations for statistical analysis. The platform streamlines research processes, making advanced and automated proteomic workflows accessible to a broader range of scientists. The MSAID Platform is globally available via https://platform.msaid.io.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73fd/11894649/8aecec878766/pr4c00871_0001.jpg

Similar articles

A Scalable, Web-Based Platform for Proteomics Data Processing, Result Storage and Analysis.

J Proteome Res. 2025 Mar 7;24(3):1241-1249. doi: 10.1021/acs.jproteome.4c00871. Epub 2025 Feb 21.

Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.

Mol Cell Proteomics. 2015 Feb;14(2):399-404. doi: 10.1074/mcp.O114.043380. Epub 2014 Nov 23.

Closha: bioinformatics workflow system for the analysis of massive sequencing data.

BMC Bioinformatics. 2018 Feb 19;19(Suppl 1):43. doi: 10.1186/s12859-018-2019-3.

Yabi: An online research environment for grid, high performance and cloud computing.

Source Code Biol Med. 2012 Feb 15;7(1):1. doi: 10.1186/1751-0473-7-1.

Transcriptome annotation in the cloud: complexity, best practices, and cost.

Gigascience. 2021 Jan 29;10(2). doi: 10.1093/gigascience/giaa163.

AnVILWorkflow: A runnable workflow package for Cloud-implemented bioinformatics analysis pipelines.

F1000Res. 2024 Oct 21;13:1257. doi: 10.12688/f1000research.155449.1. eCollection 2024.

DolphinNext: a distributed data processing platform for high throughput genomics.

BMC Genomics. 2020 Apr 19;21(1):310. doi: 10.1186/s12864-020-6714-x.

Cloud CPFP: a shotgun proteomics data analysis pipeline using cloud and high performance computing.

J Proteome Res. 2012 Dec 7;11(12):6282-90. doi: 10.1021/pr300694b. Epub 2012 Oct 29.

Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

PLoS One. 2015 Oct 26;10(10):e0140829. doi: 10.1371/journal.pone.0140829. eCollection 2015.

ATAQS: A computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry.

BMC Bioinformatics. 2011 Mar 18;12:78. doi: 10.1186/1471-2105-12-78.
