Wolstencroft Katherine, Owen Stuart, Krebs Olga, Nguyen Quyen, Stanford Natalie J, Golebiewski Martin, Weidemann Andreas, Bittkowski Meik, An Lihua, Shockley David, Snoep Jacky L, Mueller Wolfgang, Goble Carole
Leiden Institute of Advanced Computer Science, Leiden University, 111 Snellius, Niels Bohrweg 1, 2333 CA Leiden, Netherlands.
School of Computer Science, University of Manchester, Kilburn Building, Oxford Road, Manchester, M13 9PL, UK.
BMC Syst Biol. 2015 Jul 11;9:33. doi: 10.1186/s12918-015-0174-y.
Systems biology research typically involves the integration and analysis of heterogeneous data types in order to model and predict biological processes. Researchers therefore require tools and resources to facilitate the sharing and integration of data, and for linking of data to systems biology models. There are a large number of public repositories for storing biological data of a particular type, for example transcriptomics or proteomics, and there are several model repositories. However, this silo-type storage of data and models is not conducive to systems biology investigations. Interdependencies between multiple omics datasets and between datasets and models are essential. Researchers require an environment that will allow the management and sharing of heterogeneous data and models in the context of the experiments which created them.
The SEEK is a suite of tools to support the management, sharing and exploration of data and models in systems biology. The SEEK platform provides an access-controlled, web-based environment for scientists to share and exchange data and models for day-to-day collaboration and for public dissemination. A plug-in architecture allows the linking of experiments, their protocols, data, models and results in a configurable system that is available 'off the shelf'. Tools to run model simulations, plot experimental data and assist with data annotation and standardisation combine to produce a collection of resources that support analysis as well as sharing. Underlying semantic web resources additionally extract and serve SEEK metadata in RDF (Resource Description Framework). SEEK RDF enables rich semantic queries, both within SEEK and between related resources in the web of Linked Open Data.
The SEEK platform has been adopted by many systems biology consortia across Europe. It is a data management environment that has a low barrier to uptake and provides rich resources for collaboration. This paper provides an update on the functions and features of the SEEK software, and describes the use of the SEEK in the SysMO consortium (Systems biology for Micro-organisms) and the VLN (Virtual Liver Network), two large systems biology initiatives with different research aims and different scientific communities.