Towards the Representation of Genomic Data in HL7 FHIR and OMOP CDM.

Affiliation

Institute for Medical Informatics and Biometry at Carl Gustav Carus Faculty of Medicine at Technische Universität Dresden, Germany.

Publication information

Stud Health Technol Inform. 2021 Sep 21;283:86-94. doi: 10.3233/SHTI210545.

Abstract

High-throughput sequencing technologies have driven a rapid expansion of biological knowledge over the past decades and thus enable improvements in personalized medicine. To support (international) medical research that combines genomic and clinical patient data, standardization and harmonization of these data sources is highly desirable. Given the increasing importance of genomic data, we created a semantic mapping from raw genomic data to both FHIR (Fast Healthcare Interoperability Resources) and the OMOP (Observational Medical Outcomes Partnership) CDM (Common Data Model) and analyzed the data coverage of both models. For this, we calculated the mapping score for different data categories and the relative data coverage in both FHIR and OMOP CDM. Our results show that patients' genomic data can be mapped to OMOP CDM directly from a VCF (Variant Call Format) file with a coverage of slightly over 50%. However, using FHIR as an intermediate representation does not lead to further information loss, as data already stored in FHIR can be transformed into the OMOP CDM format with almost 100% success. Our findings favor extending OMOP CDM with patient genomic data using ETL, enabling researchers to apply different analysis methods, including machine learning algorithms, to genomic data.
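
The abstract does not define how the mapping score or the relative data coverage were computed. As a rough illustration of the general idea only, the Python sketch below splits one VCF data line into its eight fixed columns and computes coverage as the share of source fields that have some counterpart in a target model. The function names, the sample line, and the set of "mapped" columns are all hypothetical and do not reflect the mapping or the scores reported in the paper.

```python
# Illustrative sketch only: the paper's actual scoring scheme is not
# described in the abstract. All names and values here are assumptions.

# The eight fixed columns of a VCF data line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Split one tab-separated VCF data line into its eight fixed columns."""
    values = line.rstrip("\n").split("\t")
    return dict(zip(VCF_COLUMNS, values[:8]))


def relative_coverage(source_fields, mapped_fields):
    """Return the fraction of source fields that have a target counterpart."""
    source = set(source_fields)
    if not source:
        return 0.0
    return len(source & set(mapped_fields)) / len(source)


# Made-up variant record, used purely as sample input.
record = parse_vcf_line("1\t12345\trs0000\tA\tG\t50\tPASS\tDP=100")

# Hypothetical: pretend only these columns can be stored in the target model.
hypothetical_mapped = {"CHROM", "POS", "ID", "REF", "ALT", "FILTER"}

coverage = relative_coverage(record.keys(), hypothetical_mapped)
print(f"Relative coverage: {coverage:.0%}")
```

Running the same calculation once for a direct VCF-to-OMOP mapping and once for each step of a VCF-to-FHIR-to-OMOP route would give the kind of side-by-side comparison the abstract describes, under whatever field inventory and mapping table one chooses to assume.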
