Pavel Alisa, Saarimäki Laura A, Möbus Lena, Federico Antonio, Serra Angela, Greco Dario
Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.
BioMediTech Institute, Tampere University, Tampere, Finland.
Comput Struct Biotechnol J. 2022 Sep 5;20:4837-4849. doi: 10.1016/j.csbj.2022.08.061. eCollection 2022.
Big Data pervades nearly all areas of life sciences, yet the analysis of large integrated data sets remains a major challenge. Moreover, the field of life sciences is highly fragmented and, consequently, so are its data, knowledge, and standards. This, in turn, makes integrated data analysis and knowledge gathering across sub-fields a demanding task. At the same time, integrating diverse research angles and data types is crucial for modelling the complexity of organisms and biological processes in a holistic manner. This is especially true in drug development and chemical safety assessment, where computational methods can address the urgent need for fast, effective, and sustainable approaches. Such computational methods, however, require methodologies suited to an integrated, data-centred view of Big Data. Here we discuss Knowledge Graphs (KGs) as a solution for data-centred analysis in drug and chemical development and safety assessment. KGs are knowledge bases, data analysis engines, and knowledge discovery systems all in one, so they can support tasks ranging from simple data retrieval, through meta-analysis, to complex prediction and knowledge discovery. KGs therefore have immense potential to advance the data-centred approach and to improve the re-usability and informativeness of data. Furthermore, they can increase the power of analysis and the complexity of the processes that can be modelled, all while providing knowledge in a network data model that is natively understandable to humans.