Bretaudeau Anthony, Monjeaud Cyril, Le Bras Yvan, Legeai Fabrice, Collin Olivier
INRA, UMR Institut de Génétique, Environnement et Protection des Plantes (IGEPP), BioInformatics Platform for Agroecosystems Arthropods (BIPAA), Campus Beaulieu, Rennes, 35042 France; INRIA, IRISA, GenOuest Core Facility, Campus de Beaulieu, Rennes, 35042 France.
INRIA, IRISA, GenOuest Core Facility, Campus de Beaulieu, Rennes, 35042 France.
Gigascience. 2015 May 9;4:22. doi: 10.1186/s13742-015-0063-8. eCollection 2015.
Many bioinformatics tools use reference data, such as genome assemblies or sequence databanks. Galaxy offers multiple ways to give access to this data through its web interface. However, adding new reference data has customarily been a manual and time-consuming process, even more so when the data must be indexed in a variety of formats (e.g. Blast, Bowtie, BWA, or 2bit). BioMAJ is a widely used and stable software tool designed to automate the download and transformation of data from various sources. This data can then be used directly from the command line, within more complex systems such as Mobyle, or through a REST API.
To ease the process of giving access to reference data in Galaxy, we have developed the BioMAJ2Galaxy module, which bridges the gap between BioMAJ and Galaxy. With this module, BioMAJ can be configured to automatically download reference data, convert and/or index it in various formats, and then make it available in a Galaxy server through data libraries or data managers.
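As an illustration of the last step (making downloaded files available through a Galaxy data library), the following is a minimal sketch, not the BioMAJ2Galaxy implementation. It assumes the BioBlend Python client for the Galaxy API; the function name `link_files_into_library`, the server URL, the API key, the library name, the file path, and the dbkey are all hypothetical placeholders. Files are linked rather than copied, which is one way the disk-space saving could be obtained.

```python
def link_files_into_library(gi, library_name, paths, dbkey="?"):
    """Link server-side files into a Galaxy data library without copying them.

    `gi` is expected to behave like a bioblend.galaxy.GalaxyInstance.
    `paths` are absolute paths readable by the Galaxy server
    (hypothetically, the output directory of a BioMAJ bank).
    """
    # Reuse the library if one with this name already exists.
    matches = gi.libraries.get_libraries(name=library_name)
    if matches:
        library_id = matches[0]["id"]
    else:
        library_id = gi.libraries.create_library(library_name)["id"]

    # BioBlend expects one path per line; "link_to_files" asks Galaxy
    # to reference the files in place instead of duplicating them.
    return gi.libraries.upload_from_galaxy_filesystem(
        library_id,
        "\n".join(paths),
        dbkey=dbkey,
        link_data_only="link_to_files",
    )


def main():
    # Imported here so the helper above can be exercised without BioBlend.
    from bioblend.galaxy import GalaxyInstance

    # Placeholder URL and key: replace with real values (assumptions).
    gi = GalaxyInstance(url="https://galaxy.example.org", key="YOUR_API_KEY")
    link_files_into_library(
        gi,
        "Reference genomes",                          # hypothetical library name
        ["/db/example_bank/current/fasta/genome.fa"],  # hypothetical path
        dbkey="example_dbkey",                         # hypothetical dbkey
    )
```

In such a setup, `main()` would be invoked from a script run after a bank update completes. Linking from the server file system generally requires an administrator API key and a Galaxy configuration that permits uploads by path.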
The developments presented in this paper allow us to integrate reference data into Galaxy in an automatic, reliable, and disk-space-saving way. The code is freely available on the GenOuest GitHub account (https://github.com/genouest/biomaj2galaxy).