Prakash Sharan J, Van Auken Kimberly M, Hill David P, Sternberg Paul W
1. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
2. The Jackson Laboratory, Bar Harbor, ME 04609, USA.
bioRxiv. 2023 Sep 26:2023.04.28.538760. doi: 10.1101/2023.04.28.538760.
In modern biology, new knowledge is generated quickly, making it challenging for researchers to efficiently acquire and synthesise new information from the large volume of primary publications. To address this problem, computational approaches that generate machine-readable representations of scientific findings in the form of knowledge graphs have been developed. These representations can integrate different types of experimental data from multiple papers and biological knowledge bases in a unifying data model, providing a complementary method to manual review for interacting with published knowledge. The Gene Ontology Consortium (GOC) has created a semantic modelling framework that extends individual functional gene annotations to structured descriptions of causal networks representing biological processes (Gene Ontology Causal Activity Modelling, or GO-CAM). In this study, we explored whether the GO-CAM framework could represent knowledge of the causal relationships between environmental inputs, neural circuits and behavior in the model nematode C. elegans (Neural Circuit Causal Activity Modelling, or N-CAM). We found that, given extensions to several relevant ontologies, a wide variety of author statements from the literature about the neural circuit basis of egg-laying and carbon dioxide (CO₂) avoidance behaviors could be faithfully represented with N-CAM. Through this process, we were able to generate generic data models for several categories of experimental results. We also discuss how semantic modelling may be used to functionally annotate the connectome. Thus, Gene Ontology-based semantic modelling has the potential to support various machine-readable representations of neurobiological knowledge.