Louie Brenton, Mork Peter, Martin-Sanchez Fernando, Halevy Alon, Tarczy-Hornoch Peter
Department of Medical Education and Biomedical Informatics, University of Washington, Seattle, USA.
J Biomed Inform. 2007 Feb;40(1):5-16. doi: 10.1016/j.jbi.2006.02.007. Epub 2006 Mar 9.
Genomic medicine aims to revolutionize health care by applying our growing understanding of the molecular basis of disease. Research in this arena is data intensive: data sets are large and highly heterogeneous. To create knowledge from data, researchers must integrate these large and diverse data sets. This presents daunting informatics challenges, such as representing data in a form suitable for computational inference (knowledge representation) and linking heterogeneous data sets (data integration). Fortunately, many of these challenges can be classified as data integration problems, and existing data integration technologies may be applied to them. In this paper, we discuss the opportunities of genomic medicine and identify the informatics challenges in this domain. We also review concepts and methodologies in the field of data integration, align them with the informatics challenges in genomic medicine, and present them as potential solutions. We conclude with the challenges still not addressed in genomic medicine and the gaps that remain in data integration research to facilitate genomic medicine.