Schneider Markus, Zolg Daniel P, Samaras Patroklos, Ben Fredj Samia, Bold Dulguun, Guevende Agnes, Hogrebe Alexander, Berger Michelle T, Graber Michael, Sukumar Vishal, Mamisashvili Lizi, Bronsthein Igor, Eljagh Layla, Gessulat Siegfried, Seefried Florian, Schmidt Tobias, Frejno Martin
MSAID GmbH, Garching b. München 85748, Germany.
MSAID GmbH, Berlin 13347, Germany.
J Proteome Res. 2025 Mar 7;24(3):1241-1249. doi: 10.1021/acs.jproteome.4c00871. Epub 2025 Feb 21.
The exponential increase in proteomics data presents critical challenges for conventional processing workflows. These pipelines often consist of fragmented software packages, glued together using complex in-house scripts or error-prone manual workflows running on local hardware, which are costly to maintain and scale. The MSAID Platform offers a fully automated, managed proteomics data pipeline, consolidating formerly disjointed functions into unified, API-driven services that cover the entire process from raw data to biological insights. Backed by the cloud-native search algorithm CHIMERYS, as well as scalable cloud compute instances and data lakes, the platform facilitates efficient processing of large data sets, automation of processing via the command line, and systematic storage, analysis, and visualization of results. The data lake supports elastically growing storage and unified query capabilities, facilitating large-scale analyses and efficient reuse of previously processed data, such as aggregating longitudinally acquired studies. Users interact with the platform via a web interface, CLI client, or API, each of which provides flexible, automated access. Readily available tools for accessing result data include browser-based interrogation and one-click visualizations for statistical analysis. The platform streamlines research processes, making advanced and automated proteomic workflows accessible to a broader range of scientists. The MSAID Platform is globally available via https://platform.msaid.io.