BioCarian: search engine for exploratory searches in heterogeneous biological databases.

Author information

Zaki Nazar, Tennakoon Chandana

Affiliation

Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University (UAEU), Al Ain, PO Box 15551, United Arab Emirates.

Publication information

BMC Bioinformatics. 2017 Oct 2;18(1):435. doi: 10.1186/s12859-017-1840-4.

Abstract

BACKGROUND

A large number of biological databases are publicly available to scientists on the web, and many private databases are generated in the course of research projects. These databases come in a wide variety of formats. Web standards have evolved in recent years, and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Integration and querying of biological databases can therefore be facilitated by semantic web techniques. Heterogeneous databases can be converted into the Resource Description Framework (RDF) and queried using the SPARQL query language. Searching these databases with exact queries is straightforward. However, exploratory searches need customized solutions, especially when multiple databases are involved. Building such solutions is cumbersome and time-consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community.
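As a rough illustration of what converting a database to RDF can look like, the sketch below turns each row of a small CSV table into N-Triples statements, one per non-empty cell. The namespace `http://example.org/`, the function names, the column names, and the sample rows are all hypothetical and are not taken from BioCarian; this is one simple mapping under those assumptions, not the authors' conversion pipeline.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not BioCarian's


def escape_literal(value):
    """Escape a string so it can sit inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def table_to_ntriples(csv_text, key_column):
    """Emit one triple per non-empty cell; the key column names the subject."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "<{}record/{}>".format(BASE, quote(row[key_column], safe=""))
        for column, value in row.items():
            # Skip the key itself, overflow fields, and empty or missing cells.
            if column is None or column == key_column or not value:
                continue
            predicate = "<{}property/{}>".format(BASE, quote(column, safe=""))
            triples.append('{} {} "{}" .'.format(
                subject, predicate, escape_literal(value)))
    return triples


# Made-up sample table used only for illustration.
SAMPLE = "id,gene,chromosome\nr1,GENE_A,chr1\nr2,GENE_B,chr2\n"

if __name__ == "__main__":
    for line in table_to_ntriples(SAMPLE, "id"):
        print(line)
```

Every cell is written as a plain string literal here; a fuller conversion would typically assign datatypes and link shared values as IRIs so that records from different databases can be joined in SPARQL.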

RESULTS

We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed and has additional features such as ranking facet values by several criteria, visually indicating the relevance of a facet value, and presenting the most important facet values when a large number of choices are available. Advanced users can run SPARQL queries directly on the databases. Using this feature, users can incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com.
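To make the facet idea concrete, here is a minimal sketch, assuming the records are held in memory as (subject, predicate, object) tuples. It narrows the records by the facet values already selected and then ranks the values of another facet by frequency. Frequency is only one possible ranking criterion; the abstract does not say which criteria BioCarian uses. The function names and the data are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical triples (subject, predicate, object); not real BioCarian data.
TRIPLES = [
    ("r1", "gene", "GENE_A"), ("r1", "tissue", "tumor"),
    ("r2", "gene", "GENE_A"), ("r2", "tissue", "normal"),
    ("r3", "gene", "GENE_B"), ("r3", "tissue", "tumor"),
    ("r4", "gene", "GENE_A"), ("r4", "tissue", "tumor"),
]


def matching_subjects(triples, selections):
    """Return the subjects that carry every selected (facet, value) pair."""
    pairs_by_subject = defaultdict(set)
    for subject, predicate, obj in triples:
        pairs_by_subject[subject].add((predicate, obj))
    wanted = set(selections.items())
    return {s for s, pairs in pairs_by_subject.items() if wanted <= pairs}


def rank_facet_values(triples, facet, selections=None, top=10):
    """Count values of `facet` among matching subjects, most frequent first."""
    keep = matching_subjects(triples, selections or {})
    counts = Counter(obj for subject, predicate, obj in triples
                     if predicate == facet and subject in keep)
    return counts.most_common(top)


if __name__ == "__main__":
    # Rank gene values among the records whose tissue facet is "tumor".
    print(rank_facet_values(TRIPLES, "gene", {"tissue": "tumor"}))
```

The `top` parameter stands in for showing only the most important values when a facet has many choices. Against a real RDF store, the same counts would more likely come from a SPARQL `GROUP BY` query than from an in-memory scan.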

CONCLUSIONS

We have developed a search engine to explore RDF databases that can be used by both novice and advanced users.
