Gabetta Matteo, Limongelli Ivan, Rizzo Ettore, Riva Alberto, Segagni Daniele, Bellazzi Riccardo
Dipartimento di Ingegneria Industriale e dell'Informazione and Center for Health Technologies, Università di Pavia, Pavia, Italy.
Biomeris s.r.l., Pavia, Italy.
BMC Bioinformatics. 2015 Dec 29;16:415. doi: 10.1186/s12859-015-0861-0.
Precision medicine requires the tight integration of clinical and molecular data. This calls for technological solutions able to manage the very large volumes of high-throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed an internationally adopted framework for using existing clinical data in discovery research; coupled with genetic data, it can help define precision medicine interventions. i2b2 can be significantly advanced by efficient solutions for managing Next Generation Sequencing data.
We developed BigQ, an extension of the i2b2 framework that integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin retrieves the variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We evaluated the query performance of the system on more than 11 million variants; the results show that both query time and disk space scale linearly with the number of variants.
In this paper we describe a new i2b2 web service composed of an efficient and scalable document-based database that manages genomic variant annotations and a visual programming plugin that dynamically queries clinical and genetic data. The system can therefore manage the fast-growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations.
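The kind of query described above, retrieving the variants of a patient cohort by filtering on variant annotations, can be illustrated with a minimal Python sketch. It is not the BigQ implementation: the in-memory list stands in for the document-based database, and the field names (`patient_id`, `gene`, `effect`, `maf`), the toy records, and the `query_variants` helper are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List

Variant = Dict[str, Any]
Filter = Callable[[Variant], bool]

# Toy variant documents (hypothetical fields and values, not real data).
VARIANTS: List[Variant] = [
    {"patient_id": "P1", "gene": "GENE_A", "effect": "missense", "maf": 0.002},
    {"patient_id": "P2", "gene": "GENE_A", "effect": "synonymous", "maf": 0.150},
    {"patient_id": "P3", "gene": "GENE_B", "effect": "frameshift", "maf": 0.0001},
]


def query_variants(
    variants: Iterable[Variant],
    cohort: Iterable[str],
    filters: Iterable[Filter],
) -> List[Variant]:
    """Return variants of cohort patients that satisfy every annotation filter."""
    cohort_ids = set(cohort)
    active_filters = list(filters)  # materialize so it can be reused per variant
    return [
        v
        for v in variants
        if v["patient_id"] in cohort_ids and all(f(v) for f in active_filters)
    ]


# Hypothetical annotation filters: rare variants located in GENE_A.
def is_rare(v: Variant) -> bool:
    return v["maf"] < 0.01


def in_gene_a(v: Variant) -> bool:
    return v["gene"] == "GENE_A"


# The cohort would come from a clinical (phenotype) query; here it is hard-coded.
hits = query_variants(VARIANTS, cohort=["P1", "P2"], filters=[is_rare, in_gene_a])
```

In this sketch the cohort is fixed by hand, whereas the abstract describes it as the result of a clinical query; composing filters as independent predicates loosely mirrors combining filter blocks in a visual query builder.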