YeastHub: a semantic web use case for integrating data in the life sciences domain.

Authors

Cheung Kei-Hoi, Yip Kevin Y, Smith Andrew, Deknikker Remko, Masiar Andy, Gerstein Mark

Affiliation

Center for Medical Informatics, Yale University, New Haven, CT 06520, USA.

Publication information

Bioinformatics. 2005 Jun;21 Suppl 1:i85-96. doi: 10.1093/bioinformatics/bti1026.

Abstract

MOTIVATION

As semantic web technology matures and the need for life sciences data integration over the web grows, it is important to explore how the semantic web can address data integration needs. The main problem in data integration is the lack of widely accepted standards for expressing the syntax and semantics of data. We address this problem by exploring the use of semantic web technologies, including the Resource Description Framework (RDF), RDF Site Summary (RSS), relational-database-to-RDF mapping (D2RQ) and a native RDF data repository, to represent, store and query both metadata and data across life sciences datasets.

RESULTS

Because many biological datasets are currently available in tabular format, we introduce an RDF structure into which they can be converted. We also develop a prototype web-based application called YeastHub that demonstrates how a life sciences data warehouse can be built using a native RDF data store (Sesame). This data warehouse integrates different types of yeast genome data provided by different resources in different formats, including tabular and RDF formats. Once the data are loaded into the data warehouse, RDF-based queries can be formulated to retrieve the data in an integrated fashion.
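
To make the tabular-to-RDF idea concrete, the following is a minimal sketch in Python using only the standard library. It is not the RDF structure defined by the authors, which the abstract does not spell out. It assumes a simple mapping in which each table row becomes a resource and each column becomes a property. The namespace http://example.org/yeast/, the column names, the sample rows and the function names are all hypothetical, and the in-memory lookup merely stands in for a query against a real RDF store such as Sesame.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace for illustration only; not taken from YeastHub.
BASE = "http://example.org/yeast/"

# Made-up sample table: the gene names and values are placeholders.
SAMPLE_TSV = "orf\tgene\tessential\nYAL001C\tGENE1\tyes\nYAL002W\tGENE2\tno\n"


def table_to_triples(tsv_text, id_column, base=BASE):
    """Map each row to a resource and each other column to a property."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    triples = []
    for row in reader:
        subject = base + "row/" + quote(row[id_column])
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier column and empty cells
            predicate = base + "column/" + quote(column)
            triples.append((subject, predicate, value))
    return triples


def to_ntriples(triples):
    """Serialize (subject, predicate, literal) tuples as N-Triples lines."""
    lines = []
    for subject, predicate, value in triples:
        # Escape backslashes and double quotes inside the literal.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return lines


def subjects_with(triples, predicate, value):
    """Return subjects whose given property has the given literal value."""
    return [s for s, p, o in triples if p == predicate and o == value]


if __name__ == "__main__":
    triples = table_to_triples(SAMPLE_TSV, "orf")
    for line in to_ntriples(triples):
        print(line)
    print(subjects_with(triples, BASE + "column/essential", "yes"))
```

Once several tables are converted this way and share identifiers, a single pattern match over the combined triples can span datasets, which is the kind of integrated retrieval the paragraph above describes.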

AVAILABILITY

The YeastHub website is accessible via the following URL: http://yeasthub.gersteinlab.org.
