NICHE Research Group, Faculty of Computer Science, Dalhousie University, Canada.
Medical Informatics, Department of Community Health and Epidemiology, Dalhousie University, Canada.
Stud Health Technol Inform. 2021 May 27;281:724-728. doi: 10.3233/SHTI210267.
This paper explores the use of semantic- and evidence-based biomedical knowledge to build the RiskExplorer knowledge graph, which outlines causal associations between risk factors and chronic diseases or cancers. The intent of this work is to offer an interactive knowledge synthesis platform that empowers health-information-seeking individuals to learn about and mitigate modifiable risk factors. Our approach combines biomedical text (PubMed abstracts), the Semantic Medline database, evidence-based semantic associations, literature-based discovery, and a graph database to discover associations between risk factors and breast cancer. Our methodological framework involves (a) identifying relevant literature on specified chronic diseases or cancers, (b) extracting semantic associations with a knowledge mining tool, (c) building a rich semantic graph by transforming semantic associations into nodes and edges, and (d) applying frequency-based methods and semantic edge properties to traverse the graph and identify meaningful multi-node noncommunicable disease (NCD) risk paths. Each generated multi-node risk path consists of a source node (representing the source risk factor), one or more intermediate nodes (representing biomedical phenotypes), a target node (representing a chronic disease or cancer), and edges between nodes that represent meaningful semantic associations. The results demonstrate that our methodology can generate biomedically valid knowledge about causal risk factors and protective factors for breast cancer.
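Steps (c) and (d) of the framework can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the graph database nor the exact filtering rules, so the triple layout, the predicate whitelist `ALLOWED_PREDICATES`, the threshold `MIN_FREQUENCY`, and the functions `build_graph` and `risk_paths` are all assumptions, and the node names are placeholders rather than findings from the paper. The sketch keeps only edges that pass a frequency and predicate filter, then enumerates paths from a source risk factor to a target disease that pass through at least one intermediate node.

```python
from collections import defaultdict

# Hypothetical semantic associations: (subject, predicate, object, frequency).
# Placeholder names only; these are NOT data or results from the paper.
TRIPLES = [
    ("risk_factor_A", "CAUSES", "phenotype_X", 12),
    ("phenotype_X", "PREDISPOSES", "disease_T", 7),
    ("risk_factor_A", "ASSOCIATED_WITH", "phenotype_Y", 30),
    ("phenotype_Y", "CAUSES", "disease_T", 2),
    ("risk_factor_A", "CAUSES", "phenotype_Z", 5),
    ("phenotype_Z", "CAUSES", "phenotype_X", 4),
]

ALLOWED_PREDICATES = {"CAUSES", "PREDISPOSES"}  # assumed edge-property filter
MIN_FREQUENCY = 3  # assumed frequency threshold


def build_graph(triples, allowed, min_freq):
    """Turn triples into an adjacency list, keeping only edges whose
    predicate is allowed and whose frequency meets the threshold."""
    graph = defaultdict(list)
    for subj, pred, obj, freq in triples:
        if pred in allowed and freq >= min_freq:
            graph[subj].append((pred, obj, freq))
    return graph


def risk_paths(graph, source, target, max_edges=3):
    """Enumerate cycle-free paths from source to target that have at least
    one intermediate node and at most max_edges edges."""
    paths = []

    def walk(node, visited, edges):
        if node == target:
            if len(edges) >= 2:  # two or more edges means an intermediate node
                paths.append(list(edges))
            return
        if len(edges) == max_edges:
            return
        for pred, nxt, freq in graph.get(node, []):
            if nxt not in visited:
                edges.append((node, pred, nxt, freq))
                walk(nxt, visited | {nxt}, edges)
                edges.pop()

    walk(source, {source}, [])
    return paths


if __name__ == "__main__":
    g = build_graph(TRIPLES, ALLOWED_PREDICATES, MIN_FREQUENCY)
    for path in risk_paths(g, "risk_factor_A", "disease_T"):
        print(" -> ".join([path[0][0]] + [f"[{p}] {o}" for _, p, o, _ in path]))
```

Bounding the path length and refusing to revisit nodes keeps the enumeration finite. In a graph database the same idea would typically be expressed as a variable-length path query with conditions on edge properties.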