A streamlined workflow for conversion, peer review, and publication of genomics metadata as omics data papers.

Affiliations

Pensoft Publishers, Prof. Georgi Zlatarski Street 12, 1700 Sofia, Bulgaria.

Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev St., Block 25A, 1113 Sofia, Bulgaria.

Publication information

Gigascience. 2021 May 13;10(5). doi: 10.1093/gigascience/giab034.

DOI:10.1093/gigascience/giab034

PMID:33983435

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8117446/

Abstract

BACKGROUND

Data papers have emerged as a powerful instrument for open data publishing, obtaining credit, and establishing priority for datasets generated in scientific experiments. Academic publishing improves data and metadata quality through peer review and increases the impact of datasets by enhancing their visibility, accessibility, and reusability.

OBJECTIVE

We aimed to establish a new type of article structure and template for omics studies: the omics data paper. To improve data interoperability and further incentivize researchers to publish well-described datasets, we created a prototype workflow for streamlined import of genomics metadata from the European Nucleotide Archive directly into a data paper manuscript.

METHODS

An omics data paper template was designed by defining key article sections that encourage the description of omics datasets and methodologies. A metadata import workflow, based on REpresentational State Transfer (REST) services and XPath, was prototyped to extract information from the European Nucleotide Archive, ArrayExpress, and BioSamples databases.
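
To illustrate the general idea of a REST-plus-XPath import, the following is a minimal sketch and not the authors' implementation. It assumes that a study record can be retrieved as XML from a REST endpoint and that manuscript fields map to simple XPath expressions. The endpoint pattern `ENA_XML_URL`, the record layout in `SAMPLE_XML`, the accession `PRJEB00000`, and the names `FIELD_XPATHS`, `fetch_record_xml`, and `extract_metadata` are all assumptions introduced for illustration; real archive records are richer and should be checked against the archive's own documentation.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint pattern for retrieving a record as XML over REST;
# verify against the archive's API documentation before relying on it.
ENA_XML_URL = "https://www.ebi.ac.uk/ena/browser/api/xml/{accession}"

# Hypothetical, heavily simplified record so the sketch runs offline.
SAMPLE_XML = """<PROJECT_SET>
  <PROJECT accession="PRJEB00000" center_name="Example Center">
    <TITLE>Example metagenome survey</TITLE>
    <DESCRIPTION>Placeholder description of an example study.</DESCRIPTION>
  </PROJECT>
</PROJECT_SET>"""

# Hypothetical mapping: manuscript field -> XPath expression.
# ElementTree supports only a limited subset of XPath.
FIELD_XPATHS = {
    "title": "./PROJECT/TITLE",
    "abstract": "./PROJECT/DESCRIPTION",
}


def fetch_record_xml(accession: str) -> str:
    """Fetch a record as XML text over REST (requires network access)."""
    url = ENA_XML_URL.format(accession=accession)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def extract_metadata(xml_text: str) -> dict:
    """Pull manuscript fields out of a record using the XPath mapping."""
    root = ET.fromstring(xml_text)
    fields = {}
    for name, xpath in FIELD_XPATHS.items():
        node = root.find(xpath)
        # Missing elements become empty strings rather than raising.
        fields[name] = node.text.strip() if node is not None and node.text else ""
    project = root.find("./PROJECT")
    fields["accession"] = project.get("accession", "") if project is not None else ""
    return fields


if __name__ == "__main__":
    # Offline demonstration on the embedded sample record.
    for key, value in extract_metadata(SAMPLE_XML).items():
        print(f"{key}: {value}")
```

Keeping the field-to-XPath mapping in a plain dictionary means that supporting another record type would, under these assumptions, only require a new mapping rather than new parsing code.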

FINDINGS

The template and workflow for automatic import of standard-compliant metadata into an omics data paper manuscript provide a mechanism for enhancing existing metadata through publishing.

CONCLUSION

The omics data paper structure and workflow for import of genomics metadata will help to bring genomic and other omics datasets into the spotlight. Promoting enhanced metadata descriptions and enforcing manuscript peer review and data auditing of the underlying datasets bring additional quality to datasets. We hope that streamlined metadata reuse for scholarly publishing encourages authors to create enhanced metadata descriptions in the form of data papers to improve both the quality of their metadata and its findability and accessibility.