Szpakowski Sebastian, McCusker James, Krauthammer Michael
Program for Computational Biology and Bioinformatics (CBB), Yale University School of Medicine, New Haven, CT; Department of Pathology, Yale University School of Medicine, New Haven, CT.
Department of Pathology, Yale University School of Medicine, New Haven, CT.
Cancer Inform. 2009 May 13;8:65-73. doi: 10.4137/cin.s2335. eCollection 2009.
In this paper, we annotate and align two different gene expression microarray designs using the Genomic ELement Ontology (GELO). GELO is a new ontology that builds on an existing community resource, the Sequence Ontology (SO), to create views of genomically aligned data in a semantic web environment. We start by mapping array probes to genomic coordinates. The coordinates provide an implicit link between the probes and multiple genomic elements, such as genes, transcripts, miRNAs, and repetitive elements, which are represented using SO concepts. We then use the SPARQL RDF query language to make the links between the probes and the elements explicit. We show how this approach lets us readily determine the element coverage and genomic overlap of the two array designs. We believe that the method will ultimately be useful for integrating cancer data across multiple omic studies. The ontology and other materials described in this paper are available at http://krauthammerlab.med.yale.edu/wiki/Gelo.
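The linking step described in the abstract, where shared genomic coordinates are turned into explicit probe-to-element links, can be illustrated with a small in-memory sketch. This is not the authors' implementation: the paper performs this step with SPARQL over RDF data annotated with GELO and SO, whereas the Python below only mimics the underlying interval-overlap logic. The `Region` class, the function names, the 1-based inclusive coordinate convention, the decision to ignore strand, and all probe and element records are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A genomic interval; 1-based inclusive coordinates are an assumption."""
    name: str
    chrom: str
    start: int
    end: int
    kind: str = "probe"  # for elements: e.g. "gene", "transcript", "miRNA"


def overlaps(a: Region, b: Region) -> bool:
    """True when both regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def link_probes(probes, elements):
    """Make the implicit coordinate link explicit as (probe, element) pairs."""
    return {(p.name, e.name) for p in probes for e in elements if overlaps(p, e)}


def covered_elements(links):
    """Names of the elements hit by at least one probe."""
    return {element for _, element in links}


# Made-up genomic elements and probes for two hypothetical array designs.
elements = [
    Region("geneA", "chr1", 100, 500, "gene"),
    Region("mirB", "chr1", 800, 870, "miRNA"),
    Region("geneC", "chr2", 100, 400, "gene"),
]
array_1 = [
    Region("a1_p1", "chr1", 450, 509),
    Region("a1_p2", "chr1", 600, 659),
]
array_2 = [
    Region("a2_p1", "chr1", 120, 144),
    Region("a2_p2", "chr1", 850, 874),
    Region("a2_p3", "chr2", 500, 524),
]

links_1 = link_probes(array_1, elements)
links_2 = link_probes(array_2, elements)

# Element coverage per design, and the elements both designs interrogate.
covered_1 = covered_elements(links_1)
covered_2 = covered_elements(links_2)
shared = covered_1 & covered_2

print("array 1 covers:", sorted(covered_1))
print("array 2 covers:", sorted(covered_2))
print("covered by both:", sorted(shared))
```

The pairwise comparison is quadratic in the number of probes and elements, which is acceptable for a toy example only; real array designs would call for an interval index or, as in the paper, a query engine over the annotated data.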