Data Science Institute, NUI Galway, Galway, Ireland.
Insight Centre for Data Analytics, NUI Galway, Galway, Ireland.
Brief Bioinform. 2021 Mar 22;22(2):1679-1693. doi: 10.1093/bib/bbaa012.
Complex biological systems are traditionally modelled as graphs of interconnected biological entities. These graphs, i.e. biological knowledge graphs, are then processed using graph exploratory approaches to perform different types of analytical and predictive tasks. Although these approaches achieve high predictive accuracy, their scalability is limited by their dependence on time-consuming path exploration procedures. In recent years, owing to rapid advances in computational technologies, new approaches for modelling and mining graphs with high accuracy and scalability have emerged. These approaches, i.e. knowledge graph embedding (KGE) models, operate by learning low-rank vector representations of graph nodes and edges that preserve the graph's inherent structure. They have been used to analyse knowledge graphs from different domains, where they showed superior accuracy and scalability compared to previous graph exploratory approaches. In this work, we study this class of models in the context of biological knowledge graphs and their different applications. We then show how KGE models can be a natural fit for representing complex biological knowledge modelled as graphs. We also discuss their predictive and analytical capabilities in different biological applications. In this regard, we present two example case studies that demonstrate the capabilities of KGE models: prediction of drug-target interactions and prediction of polypharmacy side effects. Finally, we analyse different practical considerations for KGEs, and we discuss possible opportunities and challenges related to adopting them for modelling biological systems.
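To make the idea of learning vector representations of nodes and edges concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a TransE-style translation score, which is only one common KGE formulation; the abstract does not name a specific model. The entity names, relation names, embedding size, learning rate and margin are all hypothetical choices for illustration, and the usual embedding normalisation step is omitted for brevity.

```python
import numpy as np

# Hypothetical toy graph; entity and relation names are illustrative only.
triples = [
    ("drug_a", "targets", "protein_x"),
    ("drug_b", "targets", "protein_y"),
    ("protein_x", "interacts_with", "protein_y"),
]
entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
relations = sorted({p for _, p, _ in triples})
ent = {name: i for i, name in enumerate(entities)}
rel = {name: i for i, name in enumerate(relations)}
known = set(triples)

rng = np.random.default_rng(0)
dim = 8  # embedding size, chosen arbitrarily for this sketch
E = rng.normal(scale=0.1, size=(len(entities), dim))   # node vectors
R = rng.normal(scale=0.1, size=(len(relations), dim))  # edge-type vectors


def distance(s, p, o):
    """Squared translation error ||e_s + r_p - e_o||^2 (lower = more plausible)."""
    diff = E[ent[s]] + R[rel[p]] - E[ent[o]]
    return float(diff @ diff)


def train(epochs=300, lr=0.05, margin=1.0):
    """Plain SGD on a margin ranking loss, corrupting the object of each triple."""
    for _ in range(epochs):
        for s, p, o in triples:
            neg = entities[rng.integers(len(entities))]
            if neg == s or (s, p, neg) in known:
                continue  # skip self-loops and triples that are actually true
            if margin + distance(s, p, o) - distance(s, p, neg) <= 0:
                continue  # margin already satisfied, no update needed
            pos_diff = E[ent[s]] + R[rel[p]] - E[ent[o]]
            neg_diff = E[ent[s]] + R[rel[p]] - E[ent[neg]]
            # Gradient step on d_pos - d_neg (constant factor 2 folded into lr).
            E[ent[s]] -= lr * (pos_diff - neg_diff)
            R[rel[p]] -= lr * (pos_diff - neg_diff)
            E[ent[o]] += lr * pos_diff      # pull the true object closer
            E[ent[neg]] -= lr * neg_diff    # push the corrupted object away


def rank_targets(drug, candidates):
    """Order candidate proteins from most to least plausible target."""
    return sorted(candidates, key=lambda c: distance(drug, "targets", c))


train()
print(rank_targets("drug_a", ["protein_x", "protein_y"]))
```

Once the vectors are trained, scoring a candidate link is a single vector operation per triple and needs no path traversal over the graph, which is the scalability argument made above. A real application would more likely rely on an established KGE library and a held-out evaluation set than on hand-written updates like these.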