Cox Steven, Ahalt Stanley C, Balhoff James, Bizon Chris, Fecho Karamarie, Kebede Yaphet, Morton Kenneth, Tropsha Alexander, Wang Patrick, Xu Hao
Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States.
CoVar Applied Technologies, Durham, NC, United States.
JMIR Med Inform. 2020 Nov 23;8(11):e17964. doi: 10.2196/17964.
Efforts are underway to semantically integrate large biomedical knowledge graphs using common upper-level ontologies to federate graph-oriented application programming interfaces (APIs) to the data. However, federation poses several challenges, including query routing to appropriate knowledge sources, generation and evaluation of answer subsets, semantic merger of those answer subsets, and visualization and exploration of results.
We aimed to develop an interactive environment for query, visualization, and deep exploration of federated knowledge graphs.
We developed a biomedical query language and web application interface, termed Translator Query Language (TranQL), to query semantically federated knowledge graphs and explore query results. TranQL uses the Biolink data model as an upper-level biomedical ontology, together with an API standard adopted by the Biomedical Data Translator Consortium. The standard specifies a protocol for expressing a query as a graph of Biolink data elements, which is compiled from statements in the TranQL query language. Queries are mapped to federated knowledge sources, and answers are merged into a knowledge graph, with mappings between the knowledge graph and specific elements of the query. The TranQL interactive web application includes a user interface to support user exploration of the federated knowledge graph.
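The merge step described above can be illustrated with a minimal Python sketch. It is not TranQL's implementation: the query-graph layout, the `merge_answers` function, the source names (`source_a`, `source_b`), the predicates, and the identifiers (`CHEM:1`, `GENE:1`, `DISEASE:1`) are all hypothetical placeholders chosen for illustration. The sketch assumes each knowledge source returns a small answer subgraph whose nodes record which query node they bind to.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Merged graph: nodes keyed by identifier, edges as triples."""
    nodes: dict = field(default_factory=dict)  # node id -> {"type", "sources"}
    edges: list = field(default_factory=list)  # (subject, predicate, object)


# Hypothetical query graph: chemical_substance -> gene -> disease.
QUERY = {
    "nodes": [
        {"id": "n0", "type": "chemical_substance"},
        {"id": "n1", "type": "gene"},
        {"id": "n2", "type": "disease"},
    ],
    "edges": [("n0", "n1"), ("n1", "n2")],
}

# Hypothetical answer subsets from two federated sources (placeholder data).
ANSWERS = {
    "source_a": {
        "nodes": [
            {"id": "CHEM:1", "type": "chemical_substance", "binds": "n0"},
            {"id": "GENE:1", "type": "gene", "binds": "n1"},
        ],
        "edges": [("CHEM:1", "affects", "GENE:1")],
    },
    "source_b": {
        "nodes": [
            {"id": "GENE:1", "type": "gene", "binds": "n1"},
            {"id": "DISEASE:1", "type": "disease", "binds": "n2"},
        ],
        "edges": [("GENE:1", "associated_with", "DISEASE:1")],
    },
}


def merge_answers(answers):
    """Merge per-source answer subgraphs into one knowledge graph.

    Nodes sharing an identifier are collapsed into a single node that
    remembers every source asserting it. Duplicate edges are dropped.
    The returned bindings map each query node id to the set of merged
    node ids that answer it.
    """
    merged = KnowledgeGraph()
    bindings = {}
    seen_edges = set()
    for source, answer in answers.items():
        for node in answer["nodes"]:
            entry = merged.nodes.setdefault(
                node["id"], {"type": node["type"], "sources": set()}
            )
            entry["sources"].add(source)
            bindings.setdefault(node["binds"], set()).add(node["id"])
        for edge in answer["edges"]:
            if edge not in seen_edges:
                seen_edges.add(edge)
                merged.edges.append(edge)
    return merged, bindings


merged, bindings = merge_answers(ANSWERS)
```

In this sketch, the shared `GENE:1` node is what joins the two answer subsets into a single path, and `bindings` plays the role of the mapping between the merged knowledge graph and the elements of the query.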
We developed 2 real-world use cases to validate TranQL and address biomedical questions of relevance to translational science. The use cases posed questions that traversed 2 federated Translator API endpoints: Integrated Clinical and Environmental Exposures Service (ICEES) and Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP). ICEES provides open access to observational clinical and environmental data, and ROBOKOP provides access to linked biomedical entities, such as "gene," "chemical substance," and "disease," that are derived largely from curated public data sources. We successfully posed queries to TranQL that traversed these endpoints and retrieved answers that we visualized and evaluated.
TranQL can be used to ask questions of relevance to translational science, rapidly obtain answers that require assertions from a federation of knowledge sources, and provide valuable insights for translational research and clinical practice.