Integrative data semantics through a model-enabled data stewardship.

Affiliations

Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin 53754, Germany.

Department of Neurology, University Hospital Bonn (UKB), Bonn 53127, Germany.

Publication information

Bioinformatics. 2022 Aug 2;38(15):3850-3852. doi: 10.1093/bioinformatics/btac375.

Abstract

MOTIVATION

The importance of clinical data in understanding the pathophysiology of complex disorders has prompted the launch of multiple initiatives designed to generate patient-level data from various modalities. While these studies can reveal important findings relevant to the disease, each study captures different yet complementary aspects and modalities which, when combined, generate a more comprehensive picture of disease etiology. However, achieving this requires a global integration of data across studies, which proves to be challenging given the lack of interoperability of cohort datasets.

RESULTS

Here, we present the Data Steward Tool (DST), an application that allows for semi-automatic semantic integration of clinical data into ontologies, global data models and data standards. We demonstrate the applicability of the tool in the field of dementia research by establishing a Clinical Data Model (CDM) in this domain. The CDM currently consists of 277 common variables covering demographics (e.g. age and gender), diagnostics, neuropsychological tests and biomarker measurements. The DST combined with this disease-specific data model shows how interoperability between multiple, heterogeneous dementia datasets can be achieved.
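
To make the idea of semi-automatic mapping onto a common data model more concrete, the following is a minimal Python sketch and not the DST's actual implementation. It assumes a tiny, hypothetical excerpt of a data model (the variable names and synonyms in `CDM` are invented for illustration, as are the helper functions `normalize`, `suggest` and `harmonize`) and uses simple string similarity from the standard library to propose a target variable for each cohort column. A curator would then confirm or reject each suggestion before any data are renamed.

```python
from difflib import SequenceMatcher

# Hypothetical excerpt of a common data model: variable -> known synonyms.
# These names are illustrative assumptions, not taken from the published CDM.
CDM = {
    "age": ["age", "age at baseline", "patient age"],
    "gender": ["gender", "sex"],
    "mmse_total": ["mmse", "mmse total score", "mini mental state examination"],
}


def normalize(name: str) -> str:
    """Lowercase a column name and turn '_' and '-' into single spaces."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def suggest(column: str, threshold: float = 0.8):
    """Return (variable, score) for the closest synonym, or (None, score)
    when no synonym reaches the similarity threshold."""
    target = normalize(column)
    best_var, best_score = None, 0.0
    for var, synonyms in CDM.items():
        for syn in synonyms:
            score = SequenceMatcher(None, target, syn).ratio()
            if score > best_score:
                best_var, best_score = var, score
    if best_score >= threshold:
        return best_var, best_score
    return None, best_score


def harmonize(record: dict, confirmed: dict) -> dict:
    """Rename the keys of one cohort record using curator-confirmed mappings;
    columns without a confirmed mapping are dropped."""
    return {confirmed[k]: v for k, v in record.items() if k in confirmed}


# Automatic step: propose a target variable for every cohort column.
cohort_columns = ["Age_at_Baseline", "SEX", "MMSE-Total-Score", "site_id"]
suggestions = {c: suggest(c)[0] for c in cohort_columns}

# Manual step (simulated here): the curator accepts every non-empty suggestion.
confirmed = {c: v for c, v in suggestions.items() if v is not None}

record = {"Age_at_Baseline": 71, "SEX": "F", "MMSE-Total-Score": 27, "site_id": 3}
harmonized = harmonize(record, confirmed)
```

In this sketch the unmatched column `site_id` receives no suggestion and is left for manual curation, which is the "semi" in semi-automatic: the similarity score only ranks candidates, and the confirmed mapping is what gets applied to the data.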

AVAILABILITY AND IMPLEMENTATION

The DST source code and Docker images are available at https://github.com/SCAI-BIO/data-steward and https://hub.docker.com/r/phwegner/data-steward, respectively. Furthermore, the DST is hosted at https://data-steward.bio.scai.fraunhofer.de/data-steward.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
