Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Digital Health Center, Kapelle-Ufer 2, 10117 Berlin, Germany.
VeraTech for Health, Avenida del Puerto 237 - Puerta 1, Valencia, Spain.
J Biomed Inform. 2023 Aug;144:104437. doi: 10.1016/j.jbi.2023.104437. Epub 2023 Jul 12.
The reuse of data from electronic health records (EHRs) for research purposes promises to improve the data foundation for clinical trials and may even help make such trials possible. However, EHRs are heterogeneous in both structure and semantics. To standardize these data for research, the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) has recently seen increasing use. Converting EHRs into the OMOP CDM, however, requires complex and resource-intensive Extract, Transform, and Load (ETL) processes, which hampers the reuse of clinical data for research. To address the heterogeneity of EHRs and the lack of semantic precision at the care site, the openEHR standard has recently seen wider adoption. A standardized process for integrating openEHR records into the CDM could lower the barriers to making EHRs accessible for research. Yet a comprehensive approach to integrating openEHR records into the OMOP CDM has not been proposed so far.
We analyzed both standards and compared their models to identify possible mappings. Based on this comparison, we defined the processes needed to transform openEHR records into CDM tables. We also discuss the limitation posed by openEHR's unspecific demographics model and propose two possible solutions.
We developed the OMOP Conversion Language (OMOCL), a declarative language for mapping openEHR archetypes to the CDM, and used it to define a set of mappings. As a proof of concept, we implemented the Eos tool, which uses OMOCL files to automate the ETL from real-world and sample EHRs into the OMOP CDM.
Together, Eos and OMOCL provide a way to define generic mappings for integrating openEHR records into OMOP. They thus represent a significant step towards interoperability between the clinical and the research data domains. However, transforming openEHR data into the less expressive OMOP CDM leads to a loss of semantics.
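The idea of a declarative archetype-to-CDM mapping can be illustrated with a minimal Python sketch. It is not OMOCL syntax and not how Eos is implemented: the `MAPPINGS` structure, the `resolve` and `to_cdm_row` helpers, the simplified entry layout, and the placeholder concept ID are all assumptions made for illustration. Only the archetype identifier and the MEASUREMENT column names follow the respective standards.

```python
from datetime import date

# Hypothetical declarative mapping table (NOT real OMOCL syntax):
# one rule per archetype, naming the target CDM table and, for each
# CDM column, the path of the source value inside the entry.
MAPPINGS = {
    "openEHR-EHR-OBSERVATION.blood_pressure.v2": {
        "table": "measurement",
        "concept_id": 0,  # placeholder; a real mapping would name a standard concept
        "fields": {
            "value_as_number": "systolic/magnitude",
            "unit_source_value": "systolic/units",
            "measurement_date": "time",
        },
    },
}


def resolve(node, path):
    """Follow a slash-separated path through nested dicts; None if absent."""
    for key in path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_cdm_row(entry, person_id, mappings=MAPPINGS):
    """Turn one archetype-based entry into a (table, row) pair, or None if unmapped."""
    rule = mappings.get(entry["archetype_id"])
    if rule is None:
        return None
    row = {
        "person_id": person_id,
        f"{rule['table']}_concept_id": rule["concept_id"],
    }
    for column, path in rule["fields"].items():
        row[column] = resolve(entry["data"], path)
    return rule["table"], row


# Simplified, hypothetical entry; real openEHR compositions are far more nested.
entry = {
    "archetype_id": "openEHR-EHR-OBSERVATION.blood_pressure.v2",
    "data": {
        "time": date(2023, 7, 12),
        "systolic": {"magnitude": 120.0, "units": "mm[Hg]"},
        "position": "sitting",  # no mapped column, so it is dropped
    },
}

table, row = to_cdm_row(entry, person_id=1)
```

In this sketch the `position` value has no target column and is silently dropped, which is a small-scale instance of the loss of semantics noted above. Keeping the rules as data rather than code means a new archetype can be supported by adding a mapping entry instead of writing a new ETL routine.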