BMC Genomics. 2013 Oct 25;14 Suppl 6(Suppl 6):S2. doi: 10.1186/1471-2164-14-S6-S2.
The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data are available, biologists face the task of extracting (new) knowledge about the underlying biological phenomenon. To do so, they usually perform several analysis activities on the available gene expression dataset rather than a single one. Integrating heterogeneous tools and data sources into a single analysis environment is a challenging and error-prone task. Semantic integration assigns unambiguous meanings to the data shared among the applications in such an environment, so that data can be exchanged in a semantically consistent and meaningful way. This work aims to develop an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only access to heterogeneous data sources but also the definition of transformation rules on exchanged data.
We studied the challenges involved in integrating computer systems and the role software connectors play in this task. We also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. We then defined a set of activities and associated guidelines that prescribe how connectors should be developed. Finally, we applied the proposed methodology to build three integration scenarios, each using different tools to analyze different types of gene expression data.
The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. It can be used to develop connectors with both simple and nontrivial processing requirements, thus ensuring accurate data exchange and correct interpretation of the exchanged data.
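To make the idea of a connector with transformation rules more concrete, the following is a minimal sketch and not the implementation described in this work. It assumes a hypothetical source tool that reports records with the fields `probe` and `intensity`, and two hypothetical shared concept names (`GeneIdentifier`, `Log2ExpressionValue`) that stand in for terms of a reference ontology. The `Rule` and `Connector` classes are likewise illustrative names introduced here.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical shared concept names. The reference ontology itself is not
# reproduced here, so these labels are illustrative only.
GENE_ID = "GeneIdentifier"
EXPRESSION = "Log2ExpressionValue"


@dataclass
class Rule:
    """Maps one source field to one shared concept, with a value transform."""
    source_field: str
    concept: str
    transform: Callable[[str], object]


class Connector:
    """Translates a record from a source tool's format into shared concepts."""

    def __init__(self, rules: List[Rule]):
        self.rules = rules

    def translate(self, record: Dict[str, str]) -> Dict[str, object]:
        # Apply every rule: read the source field, transform its value,
        # and store it under the shared concept name.
        return {r.concept: r.transform(record[r.source_field]) for r in self.rules}


# Assumed source format: raw intensities under tool-specific column names.
rules = [
    Rule("probe", GENE_ID, str.strip),
    Rule("intensity", EXPRESSION, lambda v: math.log2(float(v))),
]
connector = Connector(rules)
result = connector.translate({"probe": " geneA ", "intensity": "8"})
print(result)
```

In this sketch the mapping from tool-specific fields to shared concepts is kept separate from the value transformations, so a consuming tool only needs to understand the shared concept names and not the source tool's own format.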