YummyData: providing high-quality open life science data.

Affiliations

Database Center for Life Science, Research Organization of Information and Systems, Kashiwa, Japan.

Novartis Institutes for Biomedical Research, Basel, Switzerland.

Publication information

Database (Oxford). 2018 Jan 1;2018. doi: 10.1093/database/bay022.

PMID:29688370

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5846286/

Abstract

Many life science datasets are now available via Linked Data technologies, meaning that they are represented in a common format (the Resource Description Framework), and are accessible via standard APIs (SPARQL endpoints). While this is an important step toward developing an interoperable bioinformatics data landscape, it also creates a new set of obstacles, as it is often difficult for researchers to find the datasets they need. Different providers frequently offer the same datasets, with different levels of support: as well as having more or less up-to-date data, some providers add metadata to describe the content, structures, and ontologies of the stored datasets while others do not. We currently lack a place where researchers can go to easily assess datasets from different providers in terms of metrics such as service stability or metadata richness. We also lack a space for collecting feedback and improving data providers’ awareness of user needs. To address this issue, we have developed YummyData, which consists of two components. One periodically polls a curated list of SPARQL endpoints, monitoring the states of their Linked Data implementations and content. The other presents the information measured for the endpoints and provides a forum for discussion and feedback. YummyData is designed to improve the findability and reusability of life science datasets provided as Linked Data and to foster its adoption. It is freely accessible at http://yummydata.org/. Database URL: http://yummydata.org/
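The polling component described above can be illustrated with a minimal sketch. It is not YummyData's implementation: the abstract does not say which queries are sent or how often, so the probe query, the function names (`build_query_url`, `http_fetch`, `probe`, `poll`), and the recorded fields are all assumptions made for illustration. The sketch sends one `ASK` query per endpoint as an HTTP GET with a `query` parameter, as the SPARQL 1.1 Protocol allows, and treats the endpoint as alive if the response is a SPARQL JSON result with a top-level boolean.

```python
import http.client
import json
import time
import urllib.parse
import urllib.request

# Hypothetical probe query; the abstract does not specify the queries used.
ASK_QUERY = "ASK { ?s ?p ?o }"


def build_query_url(endpoint, query):
    """Encode a query as a SPARQL 1.1 Protocol GET request URL."""
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urllib.parse.urlencode({"query": query})


def http_fetch(url, timeout=10.0):
    """Fetch the URL, asking for SPARQL JSON results, and return the body text."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def probe(endpoint, fetch=http_fetch):
    """Send one ASK query and record whether the endpoint answered it."""
    started = time.monotonic()
    try:
        body = fetch(build_query_url(endpoint, ASK_QUERY))
        # An ASK result in SPARQL JSON format carries a top-level "boolean".
        alive = isinstance(json.loads(body).get("boolean"), bool)
    except (OSError, ValueError, AttributeError, http.client.HTTPException):
        # Network failures, timeouts, and malformed bodies all count as "down".
        alive = False
    return {
        "endpoint": endpoint,
        "alive": alive,
        "seconds": time.monotonic() - started,
    }


def poll(endpoints, fetch=http_fetch):
    """Probe every endpoint in the curated list once."""
    return [probe(endpoint, fetch) for endpoint in endpoints]
```

Calling `poll` on a list of endpoint URLs from a scheduler such as cron would give one availability record per endpoint per run. The `fetch` parameter exists only so the probe can be exercised without network access.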
