Sharma Deepak K, Solbrig Harold R, Tao Cui, Weng Chunhua, Chute Christopher G, Jiang Guoqian
Department of Health Sciences Research, Mayo Clinic, 200 First St SW, Rochester, MN, 55905, USA.
University of Texas Health Science Center at Houston, Houston, TX, USA.
J Biomed Semantics. 2017 Jun 5;8(1):19. doi: 10.1186/s13326-017-0130-4.
Detailed Clinical Models (DCMs) have been regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous computer systems. To better support the capture and reporting of clinical cancer data, there is an emerging need for informatics solutions that support standards-based clinical models in cancer study domains. The objective of this study was to develop and evaluate a cancer genome study metadata management system that serves as key infrastructure for clinical information modeling in cancer genome study domains.
We leveraged a Semantic Web-based metadata repository enhanced with both the ISO 11179 metadata standard and the Clinical Information Modeling Initiative (CIMI) Reference Model. We used the common data elements (CDEs) defined in The Cancer Genome Atlas (TCGA) data dictionary and extracted their metadata from the NCI Cancer Data Standards Repository (caDSR) CDE dataset rendered in the Resource Description Framework (RDF). The ITEM/ITEM_GROUP pattern defined in the latest CIMI Reference Model was used to represent reusable model elements (mini-Archetypes).
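As a rough illustration of the grouping idea behind the ITEM/ITEM_GROUP pattern, the Python sketch below collects flat CDE rows into nested groups under one study domain. It is a simplified stand-in under stated assumptions, not the CIMI Reference Model's actual class definitions or the system described here: the class names `Item` and `ItemGroup`, the helper `build_domain`, the field names, and the sample rows (including the `CDE-000x` identifiers) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One reusable model element wrapping a single CDE (hypothetical fields)."""
    cde_id: str    # placeholder identifier, not a real caDSR public ID
    name: str
    datatype: str


@dataclass
class ItemGroup:
    """A named collection of Items and nested ItemGroups."""
    name: str
    items: List[Item] = field(default_factory=list)
    groups: List["ItemGroup"] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Return every Item in this group and its nested groups, depth-first."""
        found = list(self.items)
        for group in self.groups:
            found.extend(group.all_items())
        return found


def build_domain(domain_name, cde_rows):
    """Group flat (section, cde_id, name, datatype) rows into one ItemGroup per section."""
    domain = ItemGroup(name=domain_name)
    sections = {}
    for section, cde_id, name, datatype in cde_rows:
        if section not in sections:
            sections[section] = ItemGroup(name=section)
            domain.groups.append(sections[section])
        sections[section].items.append(Item(cde_id, name, datatype))
    return domain


# Invented rows for illustration only; they are not taken from the TCGA data dictionary.
rows = [
    ("drug", "CDE-0001", "drug_name", "string"),
    ("drug", "CDE-0002", "total_dose", "decimal"),
    ("response", "CDE-0003", "measure_of_response", "code"),
]
domain = build_domain("clinical pharmaceutical", rows)
print([item.name for item in domain.all_items()])
```

In a real repository the rows would come from querying the RDF-rendered CDE metadata rather than from a hard-coded list; the nesting logic is the only part this sketch is meant to show.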
We produced a metadata repository covering 38 clinical cancer genome study domains and comprising a rich collection of mini-Archetype pattern instances. In a case study of the "clinical pharmaceutical" domain in the TCGA data dictionary, we demonstrated that the enriched data elements in the metadata repository are useful for building detailed clinical models.
Our informatics approach, which leverages Semantic Web technologies, provides an effective way to build a CIMI-compliant metadata repository that would facilitate detailed clinical modeling and support use cases beyond TCGA in clinical cancer study domains.