Moreews François, Sallou Olivier, Ménager Hervé, Le Bras Yvan, Monjeaud Cyril, Blanchet Christophe, Collin Olivier
Genscale team, IRISA, Rennes, France.
Genouest Bioinformatics Facility, University of Rennes 1/IRISA, Rennes, France.
F1000Res. 2015 Dec 14;4:1443. doi: 10.12688/f1000research.7536.1. eCollection 2015.
Linux container technologies, as represented by Docker, provide an alternative to the complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high-throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its generic scope make it difficult for a bioinformatics user to find the most appropriate images. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry in authentication and permissions management, which enable its integration into existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. Each registry entry also includes user-defined tags to facilitate discovery of the image, as well as a link to the tool description in the ELIXIR registry if one already exists. If it does not, the BioShaDock registry will synchronize with the ELIXIR registry to create a new description there, based on the BioShaDock entry metadata. This link helps users get more information on the tool, such as its EDAM operations and its input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus giving such images appropriate visibility in the bioinformatics community.
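The metadata model and the link-or-create behaviour described in the abstract can be illustrated with a minimal Python sketch. It is not BioShaDock's actual implementation or API: the `ImageEntry` class, its field names, the `ensure_elixir_link` function, and the use of a plain dictionary as a stand-in for the ELIXIR registry are all assumptions made for illustration, and the EDAM identifier shown is only a placeholder value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ImageEntry:
    """Hypothetical registry entry: domain-centric metadata for one image."""
    name: str
    description: str
    edam_terms: List[str] = field(default_factory=list)  # EDAM concept identifiers
    tags: List[str] = field(default_factory=list)        # free-form, user-defined tags
    elixir_id: Optional[str] = None                      # link to an ELIXIR description


def ensure_elixir_link(entry: ImageEntry, elixir_registry: Dict[str, dict]) -> str:
    """Return the ELIXIR identifier for the entry, creating a description if needed.

    `elixir_registry` is an in-memory dict standing in for the remote
    registry (an assumption; a real system would call a web service).
    """
    # Case 1: the entry already links to an existing description.
    if entry.elixir_id is not None and entry.elixir_id in elixir_registry:
        return entry.elixir_id

    # Case 2: no description exists, so build one from the entry metadata.
    new_id = entry.name.lower()  # hypothetical identifier scheme
    elixir_registry[new_id] = {
        "name": entry.name,
        "description": entry.description,
        "edam_terms": list(entry.edam_terms),
    }
    entry.elixir_id = new_id
    return new_id


if __name__ == "__main__":
    registry: Dict[str, dict] = {}
    entry = ImageEntry(
        name="ExampleAligner",            # hypothetical tool name
        description="Illustrative sequence aligner image",
        edam_terms=["operation_0292"],    # placeholder EDAM identifier
        tags=["alignment", "demo"],
    )
    print(ensure_elixir_link(entry, registry))
```

Calling `ensure_elixir_link` a second time on the same entry returns the stored identifier without creating a duplicate description, which mirrors the "link if it exists, otherwise create" behaviour the abstract describes.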