Development of a knowledge graph framework to ease and empower translational approaches in plant research: a use-case on grain legumes.

Authors

Imbert Baptiste, Kreplak Jonathan, Flores Raphaël-Gauthier, Aubert Grégoire, Burstin Judith, Tayeh Nadim

Affiliations

Agroécologie, INRAE, Institut Agro, Univ. Bourgogne, Univ. Bourgogne Franche-Comté, Dijon, France.

Université Paris-Saclay, INRAE, URGI, Versailles, France.

Publication information

Front Artif Intell. 2023 Aug 3;6:1191122. doi: 10.3389/frai.2023.1191122. eCollection 2023.

Abstract

While the continuing decline in genotyping and sequencing costs has largely benefited plant research, some species that are key to meeting the challenges of agriculture remain mostly understudied. As a result, heterogeneous datasets for different traits are available for a significant number of these species. Because gene structures and functions are to some extent conserved through evolution, comparative genomics can be used to transfer available knowledge from one species to another. However, such a translational research approach is complex because of the multiplicity of data sources and the non-harmonized description of the data. Here, we provide two pipelines, referred to as the structural and functional pipelines, to create a framework for a NoSQL graph database (Neo4j) that integrates and queries heterogeneous data from multiple species. We call this framework the Orthology-driven knowledge base framework for translational research (Ortho_KB). The structural pipeline builds bridges across species based on orthology. The functional pipeline integrates biological information, including QTL and RNA-sequencing datasets, and uses the backbone from the structural pipeline to connect orthologs in the database. Queries can be written in the Neo4j Cypher language and can, for instance, help identify genes controlling a common trait across species. To explore the possibilities offered by such a framework, we populated Ortho_KB to obtain OrthoLegKB, an instance dedicated to legumes. The proposed model was evaluated by studying the conservation of a flowering-promoting gene. Through a series of queries, we demonstrated that our knowledge graph provides an intuitive and powerful platform to support research and development programmes.
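The kind of cross-species query described above can be illustrated with a minimal sketch. It does not use Neo4j or Cypher and is not the Ortho_KB schema: plain Python dictionaries stand in for the graph, and every gene identifier, species name, trait name, and function name below is a hypothetical placeholder invented for illustration. The sketch assumes that orthology is stored as undirected gene-to-gene links and that each trait maps to the set of genes lying under a QTL for that trait.

```python
from collections import defaultdict

# Hypothetical toy data; none of these identifiers come from OrthoLegKB.
GENE_SPECIES = {
    "geneA1": "species_A",
    "geneA2": "species_A",
    "geneB1": "species_B",
    "geneC1": "species_C",
}

# Undirected orthology links (the role played by the structural pipeline).
ORTHOLOG_PAIRS = [
    ("geneA1", "geneB1"),
    ("geneA1", "geneC1"),
    ("geneA2", "geneB1"),
]

# Trait -> genes located under a QTL for that trait (functional layer).
QTL_GENES = {
    "flowering_time": {"geneA1", "geneA2", "geneB1"},
    "seed_weight": {"geneC1"},
}


def build_ortholog_index(pairs):
    """Turn a list of ortholog pairs into a symmetric adjacency map."""
    index = defaultdict(set)
    for left, right in pairs:
        index[left].add(right)
        index[right].add(left)
    return index


def shared_trait_orthologs(trait, gene_species, ortholog_index, qtl_genes):
    """Return sorted ortholog pairs from different species where both
    genes lie under a QTL for the given trait."""
    candidates = qtl_genes.get(trait, set())
    hits = set()
    for gene in candidates:
        for partner in ortholog_index.get(gene, ()):
            same_species = gene_species[partner] == gene_species[gene]
            if partner in candidates and not same_species:
                # Sort the pair so (a, b) and (b, a) collapse to one hit.
                hits.add(tuple(sorted((gene, partner))))
    return sorted(hits)


index = build_ortholog_index(ORTHOLOG_PAIRS)
result = shared_trait_orthologs("flowering_time", GENE_SPECIES, index, QTL_GENES)
print(result)
```

In a graph database the same question would be expressed as a pattern match over gene, QTL, and orthology relationships rather than as nested loops; the sketch only shows the logic of using orthology as the bridge between species-specific trait data.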

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d676/10435283/87954bf1727b/frai-06-1191122-g0001.jpg
