University of Bergen, Norway; University of Oslo, Norway; Ontopic S.r.l., Italy.
University of North Carolina, Chapel Hill, NC, USA.
J Biomed Inform. 2022 Oct;134:104201. doi: 10.1016/j.jbi.2022.104201. Epub 2022 Sep 9.
Knowledge graphs (KGs) play a key role in enabling explainable artificial intelligence (AI) applications in healthcare. The research and healthcare AI communities have long sought to construct clinical knowledge graphs (CKGs) from heterogeneous electronic health records (EHRs). From the standardization perspective, community-based standards such as the Fast Healthcare Interoperability Resources (FHIR) and the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) are increasingly used to represent and standardize EHR data for clinical data analytics. However, the potential of such standards for building CKGs has not been well investigated.
To develop and evaluate methods and tools that expose OMOP CDM-based clinical data repositories as virtual clinical KGs that are compliant with the FHIR Resource Description Framework (RDF) specification.
We developed a system called FHIR-Ontop-OMOP to generate virtual clinical KGs from OMOP relational databases. We used an OMOP CDM-based Medical Information Mart for Intensive Care (MIMIC-III) data repository to evaluate the FHIR-Ontop-OMOP system in terms of the faithfulness of the data transformation and the conformance of the generated CKGs to the FHIR RDF specification.
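To make the idea of exposing relational rows as FHIR-style RDF concrete, the following is a minimal sketch of a single row-to-triple mapping for the OMOP `person` table. It is not the system's mapping: a virtual KG built on Ontop answers SPARQL queries by rewriting them into SQL through declarative mappings rather than by materializing triples, whereas this sketch materializes them for illustration. The function name `person_to_triples`, the `http://example.org/` base IRI, and the flattened property shape are all assumptions; only the OMOP gender concept IDs 8507 (male) and 8532 (female) are standard values.

```python
# Illustrative only: a hand-written analogue of one row-to-triple mapping.
# The IRI template and property shape are assumptions, not the system's mappings.
FHIR = "http://hl7.org/fhir/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# OMOP standard gender concept IDs mapped to FHIR administrative-gender codes.
GENDER_BY_CONCEPT = {8507: "male", 8532: "female"}


def person_to_triples(person_id, gender_concept_id, base="http://example.org/"):
    """Return (subject, predicate, object) triples for one OMOP person row."""
    subject = f"{base}Patient/{person_id}"
    triples = [(subject, RDF_TYPE, FHIR + "Patient")]
    gender = GENDER_BY_CONCEPT.get(gender_concept_id)
    if gender is not None:
        # FHIR RDF wraps primitive values in an intermediate node;
        # that node is flattened to a plain literal here to keep the sketch short.
        triples.append((subject, FHIR + "Patient.gender", gender))
    return triples


triples = person_to_triples(1, 8532)
unknown_gender = person_to_triples(2, 0)  # unmapped concept: type triple only
```

Rows whose gender concept is not in the lookup table produce only the type triple, which mirrors the usual choice of omitting an optional FHIR element rather than emitting an invalid code.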
A beta version of the system has been released. More than 100 data element mappings from 11 OMOP CDM clinical data, health system and vocabulary tables were implemented in the system, covering 11 FHIR resources. The virtual CKG generated from MIMIC-III contains 46,520 instances of FHIR Patient, 716,595 instances of Condition, 1,063,525 instances of Procedure, 24,934,751 instances of MedicationStatement, 365,181,104 instances of Observation, and 4,779,672 instances of CodeableConcept. Patient counts identified by five pairs of SQL (over the MIMIC database) and SPARQL (over the virtual CKG) queries were identical, supporting the faithfulness of the data transformation. The RDF triples generated for 100 patients were fully conformant with the FHIR RDF specification.
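The paired-query check described above can be illustrated with a toy analogue: count patients with a given condition once in SQL over the relational tables and once by pattern matching over triples derived from the same rows, then compare. This sketch uses an in-memory SQLite database and plain Python tuples in place of MIMIC-III and a SPARQL endpoint. The table contents, the concept IDs 1001 and 1002, the prefixed predicate names, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical prefixed names standing in for FHIR RDF IRIs.
RDF_TYPE = "rdf:type"
SUBJECT = "fhir:Condition.subject"
CODE = "fhir:Condition.code"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id INTEGER,
        condition_concept_id INTEGER
    );
    INSERT INTO person VALUES (1), (2), (3);
    INSERT INTO condition_occurrence VALUES
        (10, 1, 1001), (11, 1, 1001), (12, 2, 1001), (13, 3, 1002);
""")

# Build toy triples from the same rows (a real virtual KG would not materialize them).
triples = set()
for (pid,) in conn.execute("SELECT person_id FROM person"):
    triples.add((f"Patient/{pid}", RDF_TYPE, "fhir:Patient"))
for cid, pid, concept in conn.execute(
    "SELECT condition_occurrence_id, person_id, condition_concept_id "
    "FROM condition_occurrence"
):
    cond = f"Condition/{cid}"
    triples.add((cond, RDF_TYPE, "fhir:Condition"))
    triples.add((cond, SUBJECT, f"Patient/{pid}"))
    triples.add((cond, CODE, concept))


def sql_patient_count(concept_id):
    """Distinct patients with the condition, counted over the relational tables."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT person_id) FROM condition_occurrence "
        "WHERE condition_concept_id = ?",
        (concept_id,),
    ).fetchone()
    return row[0]


def graph_patient_count(concept_id):
    """Distinct patients with the condition, counted over the triples."""
    conditions = {s for s, p, o in triples if p == CODE and o == concept_id}
    patients = {o for s, p, o in triples if p == SUBJECT and s in conditions}
    return len(patients)


sql_count = sql_patient_count(1001)
graph_count = graph_patient_count(1001)
print(sql_count == graph_count)
```

Counting distinct patients rather than condition rows matters here: patient 1 has two records for concept 1001, so a row count would disagree with a patient count even when the transformation is faithful.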
The FHIR-Ontop-OMOP system can expose an OMOP database as a FHIR-compliant RDF graph. It provides a meaningful use case demonstrating the potential enabled by interoperability between FHIR and the OMOP CDM. Clinical KGs generated in FHIR RDF provide a semantic foundation for explainable AI applications in healthcare.