European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom.
Open Targets, Wellcome Genome Campus, Hinxton, United Kingdom.
PLoS Comput Biol. 2018 Jan 29;14(1):e1005968. doi: 10.1371/journal.pcbi.1005968. eCollection 2018 Jan.
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high-quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency, because queries that traverse highly interconnected data perform poorly in relational systems. The same data can be queried more efficiently when stored in a graph database. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. Adopting this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data through object-oriented queries, and also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high-performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
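The abstract's central point, that traversing highly interconnected data is a natural fit for a graph model, can be illustrated with a minimal sketch. It does not use Neo4j, Cypher, or the ContentService; it is a toy in-memory graph in Python, and every identifier in it (the `HAS_EVENT` mapping, the `pathway:` and `reaction:` names, the `descendant_reactions` function) is a hypothetical assumption made for illustration, not Reactome data or API. It shows a single traversal that follows parent-to-child edges to any depth, the kind of question that in a relational schema typically needs one join per level or a recursive query.

```python
from collections import deque

# Hypothetical toy data: these identifiers and edges are invented for
# illustration and are not taken from Reactome.
HAS_EVENT = {
    "pathway:A": ["pathway:B", "reaction:1"],
    "pathway:B": ["reaction:2", "reaction:3"],
    "pathway:C": ["reaction:4"],
}


def descendant_reactions(root, edges):
    """Return every reaction reachable from `root`, in breadth-first order."""
    seen = {root}            # guards against revisiting shared sub-pathways
    queue = deque([root])
    reactions = []
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child in seen:
                continue
            seen.add(child)
            if child.startswith("reaction:"):
                reactions.append(child)
            queue.append(child)
    return reactions


if __name__ == "__main__":
    print(descendant_reactions("pathway:A", HAS_EVENT))
```

The traversal never needs to know how deeply pathways are nested, which is the property a graph query language can express directly as a variable-length path.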