Semantic enrichment of longitudinal clinical study data using the CDISC standards and the semantic statistics vocabularies.

Authors

Leroux Hugo, Lefort Laurent

Affiliations

The Australian e-Health Research Centre, Digital Productivity Flagship, CSIRO, Level 5 - UQ Health Sciences Building 901/16, Brisbane, 4029 Queensland Australia.

Digital Economy Program, Digital Productivity Flagship, CSIRO, Canberra, 2601 ACT Australia.

Publication information

J Biomed Semantics. 2015 Apr 9;6:16. doi: 10.1186/s13326-015-0012-6. eCollection 2015.

Abstract

BACKGROUND

There is increasing recognition that the data capture phase of clinical studies needs to improve and that clinical data need to be shared more effectively. The Health Care and Life Sciences community has embraced semantic technologies to facilitate the integration of health data from electronic health records, clinical studies and pharmaceutical research. This paper explores the integration of clinical study data exchange standards and semantic statistics vocabularies to deliver clinical data as linked data, in a format that is easier to enrich with links to complementary data sources and easier for a broad user base to consume.

METHODS

We propose a Linked Clinical Data Cube (LCDC), which combines the strengths of the RDF Data Cube and DDI-RDF vocabularies to enrich clinical data based on the CDISC standards. The CDISC standards provide the mechanisms for the data to be standardised and made more accessible and accountable, whereas the RDF Data Cube and DDI-RDF vocabularies provide novel approaches to managing large volumes of heterogeneous linked data resources.
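
To make the data cube idea concrete, the following is a minimal sketch of how a single clinical measurement might be expressed as an observation in the RDF Data Cube vocabulary and written out as N-Triples. It is not the authors' implementation. The `http://example.org/lcdc/` namespace, the `subject` and `visit` dimensions, the `mmse` measure and the sample value are all hypothetical; only the `qb:` terms (`qb:Observation`, `qb:dataSet`) come from the RDF Data Cube vocabulary itself. The sketch uses only the Python standard library and builds the triples as plain strings.

```python
# Well-known namespaces.
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical namespace for this illustration (not from the paper).
EX = "http://example.org/lcdc/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype):
    """Format a typed literal in N-Triples syntax."""
    return f'"{value}"^^<{datatype}>'


def observation_triples(obs_id, cube, subject, visit, measure, value):
    """Describe one measurement as a qb:Observation.

    The dimension and measure property names are assumptions made
    for this sketch; a real cube would declare them in its
    data structure definition.
    """
    obs = iri(EX + "obs/" + obs_id)
    return [
        (obs, iri(RDF_TYPE), iri(QB + "Observation")),
        (obs, iri(QB + "dataSet"), iri(EX + "cube/" + cube)),
        (obs, iri(EX + "dim/subject"), iri(EX + "subject/" + subject)),
        (obs, iri(EX + "dim/visit"), iri(EX + "visit/" + visit)),
        (obs, iri(EX + "measure/" + measure), literal(value, XSD + "integer")),
    ]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Made-up example record: subject S001, baseline visit, score of 28.
triples = observation_triples("o1", "main", "S001", "baseline", "mmse", 28)
ntriples = to_ntriples(triples)
print(ntriples)
```

Each observation carries its own dimension values, so the same record can be reached by subject, by visit or by variable without restructuring the data. In practice an RDF library would normally replace the hand-built strings used here.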

RESULTS

We validate our approach using a large-scale longitudinal clinical study of neurodegenerative diseases. This dataset, comprising more than 1600 variables clustered in 25 different sub-domains, has been fully converted into RDF, forming one main data cube and one specialised cube for each sub-domain. One sub-domain, the Medications specialised cube, has been linked to relevant external vocabularies, such as the Australian Medicines Terminology, the ATC DDD taxonomy and the DrugBank terminology. These links provide new dimensions on which to query the data, promoting the exploration of drug-drug and drug-disease interactions.
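
The effect of linking medication records to an external classification can be sketched as follows. This is an illustration under stated assumptions, not the study's data or pipeline: the records are made up, the name-to-ATC mapping is a small hand-written table standing in for a real terminology lookup, and plain Python dictionaries are used in place of RDF. The point is only that attaching an ATC code to each record yields a new dimension, here the first three characters of the code (the ATC therapeutic subgroup), on which the records can be grouped.

```python
from collections import defaultdict

# Made-up medication records, for illustration only.
observations = [
    {"subject": "S001", "visit": "baseline", "medication": "donepezil"},
    {"subject": "S001", "visit": "baseline", "medication": "metformin"},
    {"subject": "S002", "visit": "baseline", "medication": "atorvastatin"},
    {"subject": "S003", "visit": "m18", "medication": "donepezil"},
]

# Hand-written stand-in for a lookup against an external vocabulary.
ATC_CODE = {
    "donepezil": "N06DA02",
    "metformin": "A10BA02",
    "atorvastatin": "C10AA05",
}


def enrich(records, mapping):
    """Attach an ATC code and its three-character subgroup to each record.

    Records whose medication is not in the mapping keep None for both
    fields rather than being dropped.
    """
    enriched = []
    for record in records:
        code = mapping.get(record["medication"])
        group = code[:3] if code else None
        enriched.append({**record, "atc": code, "atc_group": group})
    return enriched


def subjects_by_group(records):
    """Group subject identifiers by ATC subgroup (the added dimension)."""
    groups = defaultdict(set)
    for record in records:
        if record["atc_group"] is not None:
            groups[record["atc_group"]].add(record["subject"])
    return {group: sorted(subjects) for group, subjects in groups.items()}


enriched = enrich(observations, ATC_CODE)
groups = subjects_by_group(enriched)
print(groups)
```

Grouping by the subgroup rather than by the local medication name is what allows questions such as "which subjects take any drug in this class" to be asked without enumerating individual products.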

CONCLUSIONS

This implementation highlights the effectiveness of combining the semantic statistics vocabularies for publishing large heterogeneous data sets as linked data, and of integrating those vocabularies with the CDISC standards. In particular, it demonstrates the potential of the two vocabularies to overcome the monolithic nature of the underlying model and to improve navigation and querying of the data from multiple angles, supporting richer analysis of clinical study data. The expected benefits are more efficient use of clinicians' time and the potential to facilitate cross-study analysis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6670/4429421/91e56addc813/13326_2015_12_Fig1_HTML.jpg
