Université de Nantes, CNRS, INSERM, l'institut du thorax, Nantes, F-44000, France.
Université de Nantes, CHU Nantes, Inserm, CNRS, SFR Santé, Inserm UMS 016, CNRS UMS 3556, Nantes, F-44000, France.
F1000Res. 2021 Apr 26;10:320. doi: 10.12688/f1000research.52569.1. eCollection 2021.
Workflows are the keystone of bioimage analysis, and the NEUBIAS (Network of European BioImage AnalystS) community is trying to gather the actors of this field and organize the information around them. One of its most recent outputs is the opening of the F1000Research NEUBIAS gateway, whose main objective is to offer a channel of publication for bioimage analysis workflows and associated resources. In this paper, we want to express some personal opinions and recommendations related to finding, handling and developing bioimage analysis workflows. The emergence of "big data" in bioimaging and resource-intensive analysis algorithms make local data storage and computing solutions a limiting factor. At the same time, the need for data sharing with collaborators and a general shift towards remote work have created new challenges and avenues for the execution and sharing of bioimage analysis workflows. These challenges are to reproducibly run workflows in remote environments, in particular when their components come from different software packages, but also to document them and link their parameters and results by following the FAIR principles (Findable, Accessible, Interoperable, Reusable) to foster open and reproducible science. In this opinion paper, we focus on giving some directions to the reader to tackle these challenges and navigate through this complex ecosystem, in order to find and use workflows, and to compare workflows addressing the same problem. We also discuss tools to run workflows in the cloud and on High Performance Computing resources, and suggest ways to make these workflows FAIR.