School of Computing and Information Science, University of Maine, Orono, ME 04469, USA.
Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME 04469, USA.
Bioinformatics. 2022 Sep 30;38(19):4589-4597. doi: 10.1093/bioinformatics/btac556.
Environmental DNA (eDNA) is a rapidly expanding research field that stands to benefit from shared resources, including sampling protocols, study designs, discovered sequences, and taxonomic assignments of sequences. High-quality, community-shareable eDNA resources rely heavily on comprehensive metadata documentation that captures the complex workflows spanning field sampling, molecular biology lab work, and bioinformatic analyses. Few sources document database development for comprehensive eDNA metadata and these workflows, and no open-source software exists for this purpose.
We present medna-metadata, an open-source, modular system that aligns with the Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles, which support scholarly data reuse. The system covers the database and application development of a standardized metadata collection structure that encapsulates critical aspects of field data collection, wet lab processing, and bioinformatic analysis. Medna-metadata is showcased with metabarcoding data from the Gulf of Maine (Polinski et al., 2019).
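As an illustration of how one metadata structure can link field collection, wet lab processing, and bioinformatic analysis, the following is a minimal Python sketch. It is not the medna-metadata schema: every class, field, and function name here (`FieldSample`, `WetLabExtraction`, `BioinformaticsRun`, `provenance`) is a hypothetical stand-in chosen for this example, and the real application stores such records in PostgreSQL rather than in in-memory objects.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


# Hypothetical record types; names and fields are assumptions for
# illustration only and do not mirror the medna-metadata data model.
@dataclass
class FieldSample:
    sample_id: str
    collected_on: date
    latitude: float
    longitude: float


@dataclass
class WetLabExtraction:
    extraction_id: str
    sample: FieldSample  # link back to the field record
    method: str


@dataclass
class BioinformaticsRun:
    run_id: str
    extraction: WetLabExtraction  # link back to the wet lab record
    pipeline: str
    taxa: List[str] = field(default_factory=list)


def provenance(run: BioinformaticsRun) -> List[str]:
    """Return the identifiers from analysis back to the field sample."""
    return [
        run.run_id,
        run.extraction.extraction_id,
        run.extraction.sample.sample_id,
    ]
```

Because each stage holds a reference to the previous one, a taxonomic assignment can in principle be traced back to the sample it came from, which is the kind of linkage that comprehensive metadata documentation is meant to preserve.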
The source code of the medna-metadata web application is hosted on GitHub (https://github.com/Maine-eDNA/medna-metadata). Medna-metadata can be installed with docker-compose. Documentation is available at https://medna-metadata.readthedocs.io/en/latest/?badge=latest. The application is built with Python, PostgreSQL with PostGIS, RabbitMQ, and NGINX, and supports all major browsers. A demo is available at https://demo.metadata.maine-edna.org/.
Supplementary data are available at Bioinformatics online.