Lopes Pedro, Oliveira José Luís
DETI/IEETA, Universidade de Aveiro, Campus Universitario de Santiago, Aveiro, 3810-193, Portugal.
BMC Bioinformatics. 2015 Oct 13;16:328. doi: 10.1186/s12859-015-0761-3.
In recent years, data integration has become an everyday undertaking for life sciences researchers. Aggregating and processing data from disparate sources, whether through purpose-built software or manual processes, is a common task for scientists. However, the scope and usability of most current integration tools fail to cope with the fast-growing and highly dynamic nature of biomedical data.
In this work we introduce a reactive and event-driven framework that simplifies real-time data integration and interoperability. This platform facilitates otherwise difficult tasks, such as connecting heterogeneous services, indexing, linking and transferring data from distinct resources, or subscribing to notifications regarding the timeliness of dynamic data. For developers, the framework automates the deployment of integrative and interoperable bioinformatics applications, using atomic data storage for content change detection, and enabling agent-based intelligent extract, transform and load tasks.
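As an illustration of the event-driven change detection described above, the following is a minimal sketch and not the framework's actual API. It assumes that "content change detection" can be approximated by storing one SHA-256 digest per resource and notifying subscribers only when that digest changes. The class `ChangeMonitor`, its methods, and the resource name `gene-list` are hypothetical names introduced for this example.

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, List

# A handler receives the resource name and its new content.
Handler = Callable[[str, str], None]


class ChangeMonitor:
    """Hypothetical sketch: detect content changes and notify subscribers."""

    def __init__(self) -> None:
        # Last known digest per resource (assumed stand-in for the stored state).
        self._digests: Dict[str, str] = {}
        # Subscribers registered per resource.
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, resource: str, handler: Handler) -> None:
        """Register a callback fired whenever the resource content changes."""
        self._handlers[resource].append(handler)

    def ingest(self, resource: str, content: str) -> bool:
        """Record new content; return True and notify only if it changed."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self._digests.get(resource) == digest:
            return False  # identical content, so no event is emitted
        self._digests[resource] = digest
        for handler in self._handlers[resource]:
            handler(resource, content)
        return True


# Usage: the first ingest counts as a change, the identical second one does not.
monitor = ChangeMonitor()
received = []
monitor.subscribe("gene-list", lambda name, body: received.append((name, body)))
monitor.ingest("gene-list", "BRCA1\nTP53")
monitor.ingest("gene-list", "BRCA1\nTP53")        # unchanged content, no event
monitor.ingest("gene-list", "BRCA1\nTP53\nEGFR")  # changed content, event fired
```

Comparing digests rather than full payloads keeps the stored state small, at the cost of treating any byte-level difference (including whitespace) as a change; a real integration agent would likely normalise content before hashing.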
This work bridges the gap between the growing number of services, accessing specific data sources or algorithms, and the growing number of users, performing simple integration tasks on a recurring basis, through a streamlined workspace available to researchers and developers alike.