Ezra Tsur Elishai
Neuro-Biomorphic Engineering lab, Faculty of Engineering, Jerusalem College of Technology, Jerusalem, Israel.
BioData Min. 2017 Mar 11;10:11. doi: 10.1186/s13040-017-0130-z. eCollection 2017.
Databases are essential for research in bioinformatics and computational biology. Current challenges in database design include data heterogeneity and context-dependent interconnections between data entities. These challenges drove the development of unified data interfaces and specialized databases. The curation of specialized databases is an ever-growing challenge due to the introduction of new data sources and the emergence of new relational connections between established datasets. Here, an open-source framework for the curation of specialized databases is proposed. The framework supports user-designed models of data encapsulation, object persistence and structured interfaces to local and external data sources such as MalaCards, BioModels and the National Center for Biotechnology Information (NCBI) databases. The proposed framework was implemented using Java as the development environment, EclipseLink as the data persistence agent and Apache Derby as the database manager. Syntactic analysis was based on the J3D, jsoup, Apache Commons and w3c.dom open libraries. Finally, the construction of a specialized database for aneurysm-associated vascular diseases is demonstrated. This database contains 3-dimensional geometries of aneurysms, patients' clinical information, articles, biological models, related diseases and our recently published model of aneurysms' risk of rupture. The framework is available at: http://nbel-lab.com.
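As a rough illustration of the data encapsulation and syntactic analysis steps named above, the following minimal sketch parses an XML record with the standard w3c.dom API into a plain Java object. It is not taken from the framework itself: the class `AneurysmRecordSketch`, the `AneurysmRecord` model, its fields, and the XML layout are all hypothetical, and the sample values are made up. Persistence through EclipseLink and Apache Derby is deliberately left out so the example depends only on the Java standard library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AneurysmRecordSketch {

    // Hypothetical encapsulation model; the field names are illustrative only.
    static final class AneurysmRecord {
        final String id;
        final String location;
        final List<String> relatedDiseases;

        AneurysmRecord(String id, String location, List<String> relatedDiseases) {
            this.id = id;
            this.location = location;
            this.relatedDiseases = relatedDiseases;
        }
    }

    // Parses one record from an assumed XML layout:
    // <aneurysm id="..."><location>...</location><disease>...</disease>...</aneurysm>
    static AneurysmRecord parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            String id = root.getAttribute("id");
            String location = root.getElementsByTagName("location")
                    .item(0).getTextContent().trim();

            // Collect every <disease> child into a list of names.
            NodeList nodes = root.getElementsByTagName("disease");
            List<String> diseases = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                diseases.add(nodes.item(i).getTextContent().trim());
            }
            return new AneurysmRecord(id, location, diseases);
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers see one unchecked error type.
            throw new IllegalArgumentException("Could not parse record", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input, used only to exercise the parser.
        String xml = "<aneurysm id=\"A-001\">"
                + "<location>basilar artery</location>"
                + "<disease>hypertension</disease>"
                + "<disease>polycystic kidney disease</disease>"
                + "</aneurysm>";
        AneurysmRecord r = parse(xml);
        System.out.println(r.id + " | " + r.location + " | " + r.relatedDiseases);
    }
}
```

In a setup like the one the abstract describes, an object of this kind would presumably be annotated as a persistent entity and stored through the persistence layer; that mapping depends on the framework's actual model classes and is not shown here.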