Marquet Gwenaëlle, Mosser Jean, Burgun Anita
EA 3888, IFR 140 - Faculté de Médecine, Université de Rennes1, France.
Int J Med Inform. 2007 Dec;76 Suppl 3:S353-61. doi: 10.1016/j.ijmedinf.2007.03.004. Epub 2007 May 22.
The OBO ontologies include more than 50 standard vocabularies that cover different domains, including genomics, chemistry, anatomy and phenotype. Ontology alignment is a means to build consistent biomedical ontologies that are compatible with standard vocabularies and dedicated to specific domains, such as cancer. An alignment is defined as a set of pairs of concepts, coming from two ontologies, related by a relation R, where R is not restricted to the equivalence or subsumption relations. Alignment is performed in three major steps: first, the concepts that are equivalent in the two ontologies are identified; second, pairs of concepts that are related although not equivalent are sought; third, the relations between the concepts are characterized. We have developed a method to align ontologies that exploits the compositionality of the terms in OBO ontologies, uses the UMLS to provide synonyms and relations, and defines syntactico-semantic patterns that semantically characterize the relations between concepts. We have applied it to four OBO phenotype ontologies: mouse pathology, human disease, mammalian phenotype, and PATO. We found 386 pairs of equivalent concepts and 20,461 pairs of concepts where one concept name is included in the other term. Among the 20,461 inclusions, we were able to provide a semantic categorization for 2683 relations: in 2552 cases, the relation was present and semantically defined in the UMLS Metathesaurus; in 131 cases, the relation was characterized through semantic patterns. Our approach may help to find the semantic relations between concepts in ontologies.
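The first two steps described above (finding equivalent concepts, then pairs where one concept name is included in the other) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tokenization rule, the function names (`normalize`, `contains`, `align`), the dictionary layout and the concept IDs are all assumptions made for illustration, and the UMLS synonym lookup and the pattern-based characterization of relations are not covered.

```python
import re
from itertools import product

def normalize(name):
    """Lowercase a term and split it into a tuple of word tokens (assumed rule)."""
    return tuple(re.findall(r"[a-z0-9]+", name.lower()))

def contains(longer, shorter):
    """True if `shorter` is a contiguous token run strictly inside `longer`."""
    n, m = len(longer), len(shorter)
    if m == 0 or m >= n:
        return False
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))

def align(onto_a, onto_b):
    """Return (equivalent, included) lists of concept-ID pairs.

    onto_a, onto_b: dicts mapping a concept ID to its names and synonyms.
    """
    equivalent, included = [], []
    for (id_a, names_a), (id_b, names_b) in product(onto_a.items(), onto_b.items()):
        toks_a = {normalize(n) for n in names_a} - {()}
        toks_b = {normalize(n) for n in names_b} - {()}
        if toks_a & toks_b:
            # Step 1: at least one shared normalized name.
            equivalent.append((id_a, id_b))
        elif any(contains(a, b) or contains(b, a) for a in toks_a for b in toks_b):
            # Step 2: one name is lexically included in the other.
            included.append((id_a, id_b))
    return equivalent, included

# Hypothetical toy ontologies; IDs and terms are invented for this example.
onto_a = {"A:1": ["lung carcinoma", "carcinoma of lung"], "A:2": ["hyperplasia"]}
onto_b = {"B:1": ["Lung Carcinoma"], "B:2": ["carcinoma"], "B:3": ["epithelial hyperplasia"]}

equivalent, included = align(onto_a, onto_b)
print(equivalent)
print(included)
```

Comparing every concept pair is quadratic in the ontology sizes, which is acceptable for a toy example; an index from tokens to concepts would be a natural refinement for larger vocabularies.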