Database Center for Life Science, Research Organization of Information and Systems, Kashiwa, Japan.
Novartis Institutes for Biomedical Research, Basel, Switzerland.
Database (Oxford). 2018 Jan 1;2018. doi: 10.1093/database/bay022.
Many life science datasets are now available via Linked Data technologies, meaning that they are represented in a common format (the Resource Description Framework), and are accessible via standard APIs (SPARQL endpoints). While this is an important step toward developing an interoperable bioinformatics data landscape, it also creates a new set of obstacles, as it is often difficult for researchers to find the datasets they need. Different providers frequently offer the same datasets, with different levels of support: as well as having more or less up-to-date data, some providers add metadata to describe the content, structures, and ontologies of the stored datasets while others do not. We currently lack a place where researchers can go to easily assess datasets from different providers in terms of metrics such as service stability or metadata richness. We also lack a space for collecting feedback and improving data providers’ awareness of user needs. To address this issue, we have developed YummyData, which consists of two components. One periodically polls a curated list of SPARQL endpoints, monitoring the states of their Linked Data implementations and content. The other presents the information measured for the endpoints and provides a forum for discussion and feedback. YummyData is designed to improve the findability and reusability of life science datasets provided as Linked Data and to foster its adoption. It is freely accessible at http://yummydata.org/. Database URL: http://yummydata.org/
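The abstract does not describe how the polling component is implemented. As a rough illustration of what a single availability check against a SPARQL endpoint could look like, the sketch below sends a trivial `ASK` query over the standard SPARQL Protocol and records whether the endpoint answered and how long it took. The function names (`build_probe_url`, `parse_ask_result`, `probe`), the choice of probe query, and the timeout are assumptions made for this example, not details taken from YummyData.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical probe query: succeeds if the endpoint holds at least one triple.
ASK_QUERY = "ASK { ?s ?p ?o }"


def build_probe_url(endpoint: str, query: str = ASK_QUERY) -> str:
    """Return a SPARQL Protocol GET URL that sends `query` to `endpoint`."""
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urllib.parse.urlencode({"query": query})


def parse_ask_result(body: str) -> bool:
    """Read the boolean answer from a SPARQL JSON results document."""
    document = json.loads(body)
    if not isinstance(document, dict):
        return False
    return bool(document.get("boolean", False))


def probe(endpoint: str, timeout: float = 10.0) -> dict:
    """Run one availability check and report the outcome and elapsed time."""
    request = urllib.request.Request(
        build_probe_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            alive = parse_ask_result(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Network failures, HTTP errors, timeouts, and malformed responses
        # are all recorded as "not alive" in this simplified sketch.
        alive = False
    return {
        "endpoint": endpoint,
        "alive": alive,
        "seconds": time.monotonic() - started,
    }
```

Running `probe` on a schedule over a list of endpoint URLs and storing the results would give a simple time series of availability and response time. Richer measurements, such as metadata coverage, would need additional queries that this sketch does not attempt.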