King Abdullah University of Science and Technology.
Computational Bioscience Research Center and lead of the Structural and Functional Bioinformatics Group at King Abdullah University of Science and Technology.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa199.
Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge, and they are used in almost every major biological database. More recently, ontologies have increasingly been used to provide background knowledge in similarity-based analysis and machine learning models. The methods for combining ontologies and machine learning are still novel and under active development. We provide an overview of the methods that use ontologies to compute similarity and incorporate them into machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at https://github.com/bio-ontology-research-group/machine-learning-with-ontologies.
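As a rough illustration of how a semantic similarity measure can draw on the background knowledge in an ontology, the sketch below compares two classes by the Jaccard index of their ancestor sets. It is not taken from the accompanying notebooks: the toy hierarchy, the class names, and the functions `ancestors` and `jaccard_similarity` are all hypothetical, and Jaccard overlap is only one simple choice among many possible measures.

```python
from typing import Dict, List, Set

# Hypothetical toy is-a hierarchy (class -> direct superclasses).
# These names are illustrative and do not come from any real ontology.
TOY_ONTOLOGY: Dict[str, List[str]] = {
    "thing": [],
    "process": ["thing"],
    "metabolic_process": ["process"],
    "lipid_metabolism": ["metabolic_process"],
    "glucose_metabolism": ["metabolic_process"],
    "signaling": ["process"],
}


def ancestors(term: str, ontology: Dict[str, List[str]]) -> Set[str]:
    """Return the class itself plus all of its direct and indirect superclasses."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ontology[current])
    return seen


def jaccard_similarity(a: str, b: str, ontology: Dict[str, List[str]]) -> float:
    """Jaccard index of the two ancestor sets: shared ancestors / all ancestors."""
    anc_a = ancestors(a, ontology)
    anc_b = ancestors(b, ontology)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


if __name__ == "__main__":
    # Siblings under the same parent share more ancestors than distant classes.
    print(jaccard_similarity("lipid_metabolism", "glucose_metabolism", TOY_ONTOLOGY))
    print(jaccard_similarity("lipid_metabolism", "signaling", TOY_ONTOLOGY))
```

In this toy hierarchy the two metabolism classes share three of five ancestors, while a metabolism class and the signaling class share only two of five, so the measure ranks the siblings as more similar.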