Interoperable and scalable data analysis with microservices: applications in metabolomics.

Affiliations

Department of Medical Sciences, Clinical Chemistry, Uppsala University, Uppsala, Sweden.

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.

Publication information

Bioinformatics. 2019 Oct 1;35(19):3752-3760. doi: 10.1093/bioinformatics/btz160.

PMID: 30851093
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6761976/
Abstract

MOTIVATION

Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator.
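
To make the idea concrete, the following is a minimal sketch, not the authors' implementation, of how each tool in a workflow could be described as a Kubernetes `batch/v1` Job that runs one Docker image. The step names, image references (`example.org/tools/...`), command-line arguments and the helper functions `job_manifest` and `build_workflow` are all hypothetical and chosen only for illustration. Shared storage between steps and the actual submission to a cluster are deliberately omitted.

```python
import json


def job_manifest(step_name, image, args):
    """Build a Kubernetes batch/v1 Job manifest that runs one containerized tool.

    The manifest layout follows the standard Job resource; the values passed in
    (step name, image, arguments) are illustrative assumptions.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": step_name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed step in this sketch
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": step_name, "image": image, "args": list(args)}
                    ],
                }
            },
        },
    }


# Hypothetical two-step workflow: each step is one containerized tool.
# A real deployment would also mount a shared volume so that the output of
# one step is visible to the next; that part is left out here.
WORKFLOW = [
    ("preprocess", "example.org/tools/preprocess:1.0",
     ["--in", "/data/raw", "--out", "/data/peaks"]),
    ("statistics", "example.org/tools/statistics:1.0",
     ["--in", "/data/peaks", "--out", "/data/report"]),
]


def build_workflow(steps):
    """Return one Job manifest per workflow step, in execution order."""
    return [job_manifest(name, image, args) for name, image, args in steps]


if __name__ == "__main__":
    for manifest in build_workflow(WORKFLOW):
        print(json.dumps(manifest, indent=2))
```

Each printed manifest could, under these assumptions, be handed to a cluster (for example with `kubectl apply -f`), with a workflow engine deciding when each step is submitted.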

RESULTS

We developed a Virtual Research Environment (VRE) which facilitates the rapid integration of new tools and the development of scalable and interoperable workflows for metabolomics data analysis. The environment can be launched on demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validated our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability by integrating the major software suites, resulting in a turn-key workflow encompassing all steps of mass-spectrometry-based metabolomics, including preprocessing, statistics and identification. Microservices are a generic methodology that can serve any scientific discipline and open the way for new types of large-scale integrative science.

AVAILABILITY AND IMPLEMENTATION

The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
