Lancaster Environment Centre, Lancaster University, Library Avenue, Lancaster, LA1 4YQ, UK.
Centre for Health Informatics, Computing and Statistics (CHICAS), Lancaster Medical School, Faculty of Health and Medicine, Furness Building, Lancaster University, Lancaster, LA1 4YQ, UK.
Gigascience. 2020 May 1;9(5). doi: 10.1093/gigascience/giaa039.
The exponential accumulation of environmental and ecological data, together with the adoption of open data initiatives, brings both opportunities and challenges for integrating and synthesising relevant knowledge. These challenges need to be addressed, given the ongoing environmental crises.
Here we present Biospytial, a modular open-source knowledge engine designed to import, organise, analyse and visualise big spatial ecological datasets using the power of graph theory. The engine uses a hybrid graph-relational approach to store and access information. Linkage relationships between records are used to build semantic structures, which are stored as complex data structures in a graph database, while tabular and geospatial data are stored in an efficient spatial relational database system. We demonstrate an application using information on species occurrences, their taxonomic classification and climatic datasets. We built a knowledge graph of the Tree of Life embedded in an environmental and geographical grid, and used it to analyse threatened species co-occurring with jaguars (Panthera onca).
The Biospytial approach reduces the complexity of joining datasets using multiple tabular relations, while its scalable design eases the problem of merging datasets from different sources. Its modular design makes it possible to distribute several instances simultaneously, allowing fast and efficient handling of big ecological datasets. The provided example demonstrates the engine's capabilities in performing basic graph manipulation, analysis and visualisation of taxonomic groups co-occurring in space. The example shows potential avenues for performing novel ecological analyses, biodiversity syntheses and species distribution models aided by a network of taxonomic and spatial relationships.
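The idea of a taxonomic tree embedded in a spatial grid can be illustrated with a minimal, self-contained sketch. This is not Biospytial's API: the function names (`build_graph`, `co_occurring_species`, `ancestors`), the cell identifiers and the occurrence records are all hypothetical, and plain in-memory dictionaries stand in for the graph and relational databases. It only shows how linking each occurrence to both its taxonomic path and its grid cell makes a co-occurrence query a short graph traversal rather than a multi-table join.

```python
from collections import defaultdict

# Toy occurrence records (invented for illustration, not real survey data):
# each record is (grid cell id, taxonomic path from kingdom down to species).
OCCURRENCES = [
    ("cell_1", ("Animalia", "Chordata", "Mammalia", "Carnivora",
                "Felidae", "Panthera", "Panthera onca")),
    ("cell_1", ("Animalia", "Chordata", "Mammalia", "Perissodactyla",
                "Tapiridae", "Tapirus", "Tapirus bairdii")),
    ("cell_2", ("Animalia", "Chordata", "Mammalia", "Carnivora",
                "Felidae", "Panthera", "Panthera onca")),
    ("cell_2", ("Animalia", "Chordata", "Mammalia", "Primates",
                "Atelidae", "Ateles", "Ateles geoffroyi")),
    ("cell_3", ("Animalia", "Chordata", "Mammalia", "Carnivora",
                "Felidae", "Puma", "Puma concolor")),
]


def build_graph(occurrences):
    """Build three link tables from the occurrence records.

    parent_of:  taxon -> its parent taxon (taxonomic edges)
    cells_of:   taxon at any rank -> set of grid cells where it occurs
    species_in: grid cell -> set of species recorded in that cell
    """
    parent_of = {}
    cells_of = defaultdict(set)
    species_in = defaultdict(set)
    for cell, path in occurrences:
        # Link each taxon to the rank directly above it.
        for parent, child in zip(path, path[1:]):
            parent_of[child] = parent
        # Link every taxon on the path to the cell (spatial edges).
        for taxon in path:
            cells_of[taxon].add(cell)
        species_in[cell].add(path[-1])
    return parent_of, cells_of, species_in


def co_occurring_species(cells_of, species_in, target):
    """Return species sharing at least one grid cell with the target taxon."""
    found = set()
    for cell in cells_of.get(target, set()):
        found |= species_in[cell]
    found.discard(target)
    return found


def ancestors(parent_of, taxon):
    """Return the chain of parent taxa, from nearest to most distant."""
    chain = []
    while taxon in parent_of:
        taxon = parent_of[taxon]
        chain.append(taxon)
    return chain


parent_of, cells_of, species_in = build_graph(OCCURRENCES)
neighbours = co_occurring_species(cells_of, species_in, "Panthera onca")
print(sorted(neighbours))
print(ancestors(parent_of, "Panthera onca"))
```

Because every rank on the taxonomic path is linked to the cell, the same query works at any level: passing `"Felidae"` instead of `"Panthera onca"` returns species sharing a cell with any member of that family in the toy data.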