A streamlined workflow for conversion, peer review, and publication of genomics metadata as omics data papers.

Affiliations

Pensoft Publishers, Prof. Georgi Zlatarski Street 12, 1700 Sofia, Bulgaria.

Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev St., Block 25A, 1113 Sofia, Bulgaria.

Publication information

Gigascience. 2021 May 13;10(5). doi: 10.1093/gigascience/giab034.

DOI: 10.1093/gigascience/giab034
PMID: 33983435
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8117446/

Abstract

BACKGROUND

Data papers have emerged as a powerful instrument for open data publishing, obtaining credit, and establishing priority for datasets generated in scientific experiments. Academic publishing improves data and metadata quality through peer review and increases the impact of datasets by enhancing their visibility, accessibility, and reusability.

OBJECTIVE

We aimed to establish a new type of article structure and template for omics studies: the omics data paper. To improve data interoperability and further incentivize researchers to publish well-described datasets, we created a prototype workflow for streamlined import of genomics metadata from the European Nucleotide Archive directly into a data paper manuscript.

METHODS

An omics data paper template was designed by defining key article sections that encourage the description of omics datasets and methodologies. A metadata import workflow, based on REpresentational State Transfer services and Xpath, was prototyped to extract information from the European Nucleotide Archive, ArrayExpress, and BioSamples databases.

FINDINGS

The template and workflow for automatic import of standard-compliant metadata into an omics data paper manuscript provide a mechanism for enhancing existing metadata through publishing.

CONCLUSION

The omics data paper structure and workflow for import of genomics metadata will help to bring genomic and other omics datasets into the spotlight. Promoting enhanced metadata descriptions and enforcing manuscript peer review and data auditing of the underlying datasets brings additional quality to datasets. We hope that streamlined metadata reuse for scholarly publishing encourages authors to create enhanced metadata descriptions in the form of data papers to improve both the quality of their metadata and its findability and accessibility.
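
The METHODS paragraph describes a metadata import step built on REST services and XPath. The abstract does not give the implementation, so the following is only a minimal sketch of the XPath half of that idea, using Python's standard library. The XML record, its element names, the accession `EXAMPLE0001`, the section names, and the function `extract_sections` are all hypothetical illustrations rather than the authors' code or real archive data. The REST retrieval step is omitted, so the record is embedded as a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified study record. Element names and values are
# illustrative assumptions, not real archive data.
SAMPLE_XML = """\
<STUDY_SET>
  <STUDY accession="EXAMPLE0001">
    <DESCRIPTOR>
      <STUDY_TITLE>Example metagenome survey</STUDY_TITLE>
      <STUDY_ABSTRACT>Placeholder abstract text.</STUDY_ABSTRACT>
    </DESCRIPTOR>
    <STUDY_ATTRIBUTES>
      <STUDY_ATTRIBUTE>
        <TAG>sequencing_platform</TAG>
        <VALUE>ExamplePlatform</VALUE>
      </STUDY_ATTRIBUTE>
    </STUDY_ATTRIBUTES>
  </STUDY>
</STUDY_SET>
"""

# Hypothetical mapping from manuscript sections to XPath expressions.
# ElementTree supports only a subset of XPath, which is enough here.
SECTION_XPATHS = {
    "title": "./STUDY/DESCRIPTOR/STUDY_TITLE",
    "abstract": "./STUDY/DESCRIPTOR/STUDY_ABSTRACT",
    "platform": (
        "./STUDY/STUDY_ATTRIBUTES/"
        "STUDY_ATTRIBUTE[TAG='sequencing_platform']/VALUE"
    ),
}


def extract_sections(xml_text, xpaths=SECTION_XPATHS):
    """Return a dict of manuscript section name -> text found via XPath.

    Sections whose XPath matches nothing are set to None, so that an
    author can see which parts of the manuscript still need manual input.
    """
    root = ET.fromstring(xml_text)
    sections = {}
    for name, path in xpaths.items():
        node = root.find(path)
        has_text = node is not None and node.text
        sections[name] = node.text.strip() if has_text else None
    # The accession is stored as an attribute rather than element text.
    study = root.find("./STUDY")
    sections["accession"] = study.get("accession") if study is not None else None
    return sections


if __name__ == "__main__":
    for key, value in extract_sections(SAMPLE_XML).items():
        print(f"{key}: {value}")
```

Keeping the section-to-XPath mapping in a plain dictionary means a new manuscript section can be supported by adding one entry, and a missing field yields `None` instead of an exception.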

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4eea/8117446/6f3fd3bd3ca5/giab034fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4eea/8117446/4dbe4994ca6f/giab034fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4eea/8117446/34578066e2bd/giab034fig3.jpg

Similar articles

1. A streamlined workflow for conversion, peer review, and publication of genomics metadata as omics data papers.
Gigascience. 2021 May 13;10(5). doi: 10.1093/gigascience/giab034.
2. METAGENOTE: a simplified web platform for metadata annotation of genomic samples and streamlined submission to NCBI's sequence read archive.
BMC Bioinformatics. 2020 Sep 3;21(1):378. doi: 10.1186/s12859-020-03694-0.
3. OMD Curation Toolkit: a workflow for in-house curation of public omics datasets.
BMC Bioinformatics. 2024 May 9;25(1):184. doi: 10.1186/s12859-024-05803-9.
4. FAIR data station for lightweight metadata management and validation of omics studies.
Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad014. Epub 2023 Mar 6.
5. FAIR-compliant clinical, radiomics and DICOM metadata of RIDER, interobserver, Lung1 and head-Neck1 TCIA collections.
Med Phys. 2020 Nov;47(11):5931-5940. doi: 10.1002/mp.14322. Epub 2020 Jun 27.
6. From Raw Data to FAIR Data: The FAIRification Workflow for Health Research.
Methods Inf Med. 2020 Jun;59(S 01):e21-e32. doi: 10.1055/s-0040-1713684. Epub 2020 Jul 3.
7. OMeta: an ontology-based, data-driven metadata tracking system.
BMC Bioinformatics. 2019 Jan 7;20(1):8. doi: 10.1186/s12859-018-2580-9.
8. A multi-omics data analysis workflow packaged as a FAIR Digital Object.
Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giad115.
9. Improving the discoverability, accessibility, and citability of omics datasets: a case report.
J Am Med Inform Assoc. 2017 Mar 1;24(2):388-393. doi: 10.1093/jamia/ocw096.
10. BioSamples database: FAIRer samples metadata to accelerate research data management.
Nucleic Acids Res. 2022 Jan 7;50(D1):D1500-D1507. doi: 10.1093/nar/gkab1046.

Cited by

1. Importance of timely metadata curation to the global surveillance of genetic diversity.
Conserv Biol. 2023 Aug;37(4):e14061. doi: 10.1111/cobi.14061. Epub 2023 Mar 10.
2. Recent developments and future directions in meta-analysis of differential gene expression in livestock RNA-Seq.
Front Genet. 2022 Sep 19;13:983043. doi: 10.3389/fgene.2022.983043. eCollection 2022.
3. The state of Medusozoa genomics: current evidence and future challenges.
Gigascience. 2022 May 17;11. doi: 10.1093/gigascience/giac036.
4. Perspectives on rigor and reproducibility in single cell genomics.
PLoS Genet. 2022 May 10;18(5):e1010210. doi: 10.1371/journal.pgen.1010210. eCollection 2022 May.
