Perspectives on automated composition of workflows in the life sciences.

Affiliations

Utrecht University, 3584 CS Utrecht, The Netherlands.

Leiden University Medical Center, 2333 ZA, Leiden, The Netherlands.

Publication information

F1000Res. 2021 Sep 7;10:897. doi: 10.12688/f1000research.54159.1. eCollection 2021.

DOI: 10.12688/f1000research.54159.1
PMID: 34804501
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8573700/
Abstract

Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the "big picture" of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.
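
As a reading aid, the six life-cycle stages named in the abstract can be written down as an ordered data structure. The following is only a minimal sketch: the names `WorkflowStage`, `next_stage`, and `transitions` are hypothetical and not taken from the article, and the strictly linear progression from one stage to the next is a simplifying assumption of this example, not a claim made by the authors.

```python
from enum import IntEnum
from typing import List, Optional, Tuple


class WorkflowStage(IntEnum):
    """The six life-cycle stages listed in the abstract, in order."""
    SCIENTIFIC_QUESTION = 1   # scientific question or hypothesis
    CONCEPTUAL_WORKFLOW = 2
    ABSTRACT_WORKFLOW = 3
    CONCRETE_WORKFLOW = 4
    PRODUCTION_WORKFLOW = 5
    SCIENTIFIC_RESULTS = 6


def next_stage(stage: WorkflowStage) -> Optional[WorkflowStage]:
    """Return the stage after `stage`, or None after the last one.

    Assumption: stages are visited strictly in order (illustrative only).
    """
    if stage is WorkflowStage.SCIENTIFIC_RESULTS:
        return None
    return WorkflowStage(stage + 1)


def transitions() -> List[Tuple[WorkflowStage, WorkflowStage]]:
    """List every adjacent (from, to) pair of stages."""
    stages = list(WorkflowStage)
    return list(zip(stages, stages[1:]))


if __name__ == "__main__":
    for src, dst in transitions():
        print(f"{src.name} -> {dst.name}")
```

Each pair returned by `transitions()` corresponds to one of the stage-to-stage transitions that, according to the abstract, are facilitated by tools and methods incorporating domain knowledge.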

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f0/8573700/cec715b04e18/f1000research-10-57615-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f0/8573700/627add44b0d1/f1000research-10-57615-g0000.jpg
