Matching biomedical ontologies with GCN-based feature propagation.

Affiliations

School of Computer Science and Engineering, Southeast University, Nanjing 210018, China.

Monash University Joint Graduate School, Southeast University, Suzhou 215123, China.

Publication information

Math Biosci Eng. 2022 Jun 9;19(8):8479-8504. doi: 10.3934/mbe.2022394.

Abstract

With a growing number of biomedical ontologies evolving independently, matching these ontologies to solve the interoperability problem has become a critical issue in biomedical applications. Traditional biomedical ontology matching methods are mostly based on rules or similarities for concepts and properties. These approaches require manually designed rules that not only fail to address the heterogeneity of domain ontology terminology and the ambiguity of words with multiple meanings, but also make it difficult to capture structural information during matching in ontologies that contain a large amount of semantics. Recently, various knowledge graph (KG) embedding techniques that use deep learning to deal with heterogeneity in KGs have quickly gained considerable attention. However, KG embedding focuses mainly on entity alignment (EA). EA tasks and ontology matching (OM) tasks differ substantially in matching elements, semantic information, and application scenarios, so these methods cannot be applied directly to biomedical ontologies, which contain abstract concepts but almost no entities. To tackle these issues, this paper proposes a novel approach called BioOntGCN that directly learns embeddings of ontology pairs for biomedical ontology matching. Specifically, we first generate a pair-wise connectivity graph (PCG) of two ontologies, whose nodes are concept pairs and whose edges correspond to property pairs. Subsequently, we learn node embeddings of the PCG to predict the matching results in two phases: 1) a convolutional neural network (CNN) extracts the similarity feature vectors of nodes; 2) a graph convolutional network (GCN) propagates the similarity features and produces the final embeddings of concept pairs. Consequently, the biomedical ontology matching problem is transformed into a binary classification problem.
We conduct systematic experiments on real-world biomedical ontologies from the Ontology Alignment Evaluation Initiative (OAEI), and the results show that our approach significantly outperforms other entity alignment methods and achieves state-of-the-art performance. This indicates that BioOntGCN is more applicable to ontology matching than EA methods. At the same time, BioOntGCN achieves substantially better performance than previous OM systems, which suggests that BioOntGCN, being based on representation learning, is more effective than traditional approaches.
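The pipeline described in the abstract (build a PCG, then propagate pair features with a GCN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `build_pcg`, `normalized_adjacency`, and `gcn_layer` are hypothetical, the toy triples are invented for illustration, the CNN similarity features are replaced by random placeholder vectors, and the propagation rule is the standard symmetric-normalized GCN layer, which is assumed here rather than confirmed by the abstract.

```python
import numpy as np


def build_pcg(triples_a, triples_b):
    """Build a pair-wise connectivity graph from two lists of
    (head, property, tail) triples, one list per ontology.

    Nodes are concept pairs (a, b). An edge links (h1, h2) to (t1, t2)
    whenever h1 -p1-> t1 holds in ontology A and h2 -p2-> t2 holds in B;
    the edge is labelled with the property pair (p1, p2).
    """
    nodes = {}     # concept pair -> integer node index
    edges = set()  # (source index, target index, property pair)
    for h1, p1, t1 in triples_a:
        for h2, p2, t2 in triples_b:
            src = nodes.setdefault((h1, h2), len(nodes))
            dst = nodes.setdefault((t1, t2), len(nodes))
            edges.add((src, dst, (p1, p2)))
    return nodes, edges


def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected view of the PCG."""
    a = np.eye(num_nodes)  # self-loops keep each node's own features
    for src, dst, _ in edges:
        a[src, dst] = 1.0
        a[dst, src] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, features, weights):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    return np.maximum(a_hat @ features @ weights, 0.0)


# Toy ontologies (invented for illustration only).
onto_a = [("Heart", "part_of", "Body")]
onto_b = [("Cardiac_Organ", "part_of", "Organism")]

nodes, edges = build_pcg(onto_a, onto_b)
a_hat = normalized_adjacency(len(nodes), edges)

rng = np.random.default_rng(0)
# Placeholder for the CNN similarity features of each concept pair.
pair_features = rng.random((len(nodes), 8))
w_hidden = rng.normal(size=(8, 4))  # untrained weights, illustration only
w_out = rng.normal(size=(4, 1))

embeddings = gcn_layer(a_hat, pair_features, w_hidden)
# Binary classification head: probability that a concept pair matches.
match_prob = 1.0 / (1.0 + np.exp(-(embeddings @ w_out)))
```

With untrained weights the probabilities are meaningless; the sketch only shows how each concept pair becomes a node whose score depends on its neighbours in the PCG, which is what turns matching into per-node binary classification.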

