Computational framework to support integration of biomolecular and clinical data within a translational approach.

Affiliation

Department of Computing and Mathematics, Faculty of Philosophy, Sciences and Languages of Ribeirão Preto, University of São Paulo, São Paulo, Brazil.

Publication information

BMC Bioinformatics. 2013 Jun 6;14:180. doi: 10.1186/1471-2105-14-180.

DOI:10.1186/1471-2105-14-180

PMID:23742129

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3688149/

Abstract

BACKGROUND

The main goal of translational medicine is to use the knowledge produced by science to promote human health. Making this feasible requires computational methods that can handle the large amount of information arising from bench to bedside and cope with its heterogeneity. One computational challenge is integrating clinical, socio-demographic and biological data. In this effort, ontologies play an essential role as a powerful artifact for knowledge representation. Chado is a modular, ontology-oriented database model that has gained popularity for its robustness and flexibility as a generic platform for storing biological data; however, it lacks support for representing clinical and socio-demographic information.

RESULTS

We have implemented an extension of Chado, the Clinical Module, to allow the representation of clinical and socio-demographic information. Our approach consists of a framework for data integration through the use of a common reference ontology. The framework is designed in four levels: the data level, which stores the data; the semantic level, which integrates and standardizes the data through ontologies; the application level, which manages clinical databases, ontologies and the data integration process; and the web interface level, which allows interaction between the user and the system. The Clinical Module was built on the Entity-Attribute-Value (EAV) model. We also propose a methodology to migrate data from legacy clinical databases to the integrative framework. A Chado instance was initialized using a relational database management system. The Clinical Module was implemented and the framework was loaded with data from a real clinical research database. Clinical and demographic data, as well as biomaterial data, were obtained from patients with head and neck tumors. We implemented the IPTrans tool, a complete environment for data migration that comprises: the construction of an ontology-based model to describe the legacy clinical data; the Extraction, Transformation and Load (ETL) process, which extracts the data from the source clinical database and loads it into the Clinical Module of Chado; and the development of a web tool and a Bridge Layer that adapts the web tool, as well as other applications, to Chado.
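
To make the EAV and ETL ideas above concrete, the following is a minimal sketch, not the Clinical Module's actual schema or the IPTrans implementation. It assumes a simplified layout in the spirit of Chado's property tables: the table names `patient`, `patientprop` and `cvterm`, the legacy column names in `COLUMN_TO_TERM`, and the ontology term names are all hypothetical, and SQLite stands in for the relational database management system.

```python
import sqlite3

# Hypothetical, simplified schema: table and column names are illustrative
# and are NOT the Clinical Module's real definitions.
SCHEMA = """
CREATE TABLE cvterm (              -- ontology terms used as attribute names
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE patient (             -- the "entity" in Entity-Attribute-Value
    patient_id INTEGER PRIMARY KEY,
    uniquename TEXT NOT NULL UNIQUE
);
CREATE TABLE patientprop (         -- one row per (entity, attribute, value)
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
    value      TEXT,
    UNIQUE (patient_id, type_id)
);
"""

# Assumed mapping from legacy column names to reference-ontology term names.
COLUMN_TO_TERM = {
    "sex_cd": "sex",
    "age": "age at diagnosis",
    "tumor_loc": "tumor site",
}


def get_or_create_term(conn, name):
    """Return the cvterm_id for a term name, inserting the term if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def load_legacy_rows(conn, rows):
    """Transform wide legacy rows into EAV rows; return how many were loaded."""
    loaded = 0
    for row in rows:
        cur = conn.execute(
            "INSERT INTO patient (uniquename) VALUES (?)", (row["id"],)
        )
        patient_id = cur.lastrowid
        for column, term in COLUMN_TO_TERM.items():
            value = row.get(column)
            if value is None:
                continue  # EAV stores only the attributes that are present
            conn.execute(
                "INSERT INTO patientprop (patient_id, type_id, value) "
                "VALUES (?, ?, ?)",
                (patient_id, get_or_create_term(conn, term), str(value)),
            )
            loaded += 1
    conn.commit()
    return loaded


def patient_attributes(conn, uniquename):
    """Return one patient's attributes as a {term name: value} dict."""
    rows = conn.execute(
        "SELECT c.name, pp.value FROM patientprop pp "
        "JOIN patient p ON p.patient_id = pp.patient_id "
        "JOIN cvterm c ON c.cvterm_id = pp.type_id "
        "WHERE p.uniquename = ?",
        (uniquename,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    legacy = [{"id": "P001", "sex_cd": "F", "age": 57, "tumor_loc": "larynx"}]
    load_legacy_rows(conn, legacy)
    print(patient_attributes(conn, "P001"))
```

Because each attribute is a row rather than a column, a new clinical variable only needs a new ontology term and a new entry in the mapping, with no change to the table structure. This is the kind of flexibility an EAV design is usually chosen for, at the cost of more joins when reading the data back.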

CONCLUSIONS

Currently available open-source computational solutions for translational science do not have a model to represent biomolecular information and are not integrated with existing bioinformatics tools. On the other hand, existing genomic data models do not represent clinical patient data. We developed a framework to support translational research by integrating biomolecular information from different "omics" technologies with patients' clinical and socio-demographic data. Such a framework should present three features: flexibility, compression and robustness. Experiments based on a use case demonstrated that the proposed system meets the requirements of flexibility and robustness, leading to the desired integration. The Clinical Module can be accessed at http://dcm.ffclrp.usp.br/caib/pg=iptrans.
