semCDI: a query formulation for semantic data integration in caBIG.

Author information

Shironoshita E Patrick, Jean-Mary Yves R, Bradley Ray M, Kabuka Mansur R

Affiliation

INFOTECH Soft, Inc., 9200 Dadeland Blvd., Ste 620, Miami, FL 33156, USA.

Publication information

J Am Med Inform Assoc. 2008 Jul-Aug;15(4):559-68. doi: 10.1197/jamia.M2732. Epub 2008 Apr 24.

Abstract

OBJECTIVES

To develop mechanisms to formulate queries over the semantic representation of cancer-related data services available through the cancer Biomedical Informatics Grid (caBIG).

DESIGN

The semCDI query formulation treats caBIG semantic concepts, metadata, and data as an ontology, and defines a methodology for specifying queries in the SPARQL query language, extended with Horn rules. semCDI joins data that represent different concepts through associations modeled as object properties, and merges data that represent the same concept in different sources through Common Data Elements (CDEs) modeled as datatype properties. Horn rules specify the additional semantics that state the conditions for merging data.

VALIDATION

To validate this formulation, a prototype was constructed, and two queries were executed against currently available caBIG data services.
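The join and merge behavior described under DESIGN can be illustrated with a minimal sketch. It is not the semCDI prototype: plain Python stands in for a SPARQL engine and a rule reasoner, and every identifier (the `srcA:`/`srcB:` sources, `ex:hasSpecimen`, and the `cde:` properties) is hypothetical. The assumed Horn rule is: if x and y are both of type C and share the same value for a CDE, then x and y denote the same individual.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples from two data services.
TRIPLES = [
    # Source A: a patient linked to a specimen through an object property.
    ("srcA:patient1", "rdf:type", "ex:Patient"),
    ("srcA:patient1", "cde:PatientIdentifier", "P-001"),
    ("srcA:patient1", "ex:hasSpecimen", "srcA:specimen7"),
    ("srcA:specimen7", "rdf:type", "ex:Specimen"),
    ("srcA:specimen7", "cde:TissueSite", "lung"),
    # Source B: the same concept (Patient) described by datatype properties.
    ("srcB:subject42", "rdf:type", "ex:Patient"),
    ("srcB:subject42", "cde:PatientIdentifier", "P-001"),
    ("srcB:subject42", "cde:Diagnosis", "adenocarcinoma"),
    ("srcB:subject99", "rdf:type", "ex:Patient"),
    ("srcB:subject99", "cde:PatientIdentifier", "P-002"),
]


def objects(triples, subjects, predicate):
    """Return the sorted objects of `predicate` for any subject in `subjects`."""
    return sorted(o for s, p, o in triples if s in subjects and p == predicate)


def merge_by_cde(triples, concept, cde):
    """Apply the assumed Horn rule: instances of `concept` that share a
    value for the datatype property `cde` are grouped as one individual."""
    instances = {s for s, p, o in triples if p == "rdf:type" and o == concept}
    groups = defaultdict(set)
    for s, p, o in triples:
        if s in instances and p == cde:
            groups[o].add(s)
    return dict(groups)


def run_query(triples):
    """For each merged patient, list diagnoses and specimen tissue sites."""
    rows = []
    merged = merge_by_cde(triples, "ex:Patient", "cde:PatientIdentifier")
    for patient_id, members in sorted(merged.items()):
        diagnoses = objects(triples, members, "cde:Diagnosis")
        # Join across concepts by following the object property.
        specimens = objects(triples, members, "ex:hasSpecimen")
        sites = objects(triples, specimens, "cde:TissueSite")
        rows.append((patient_id, diagnoses, sites))
    return rows


if __name__ == "__main__":
    for row in run_query(TRIPLES):
        print(row)
```

In this sketch the row for `P-001` combines the diagnosis held only by source B with the tissue site reachable only through source A, which is the kind of cross-source result that merging on a shared CDE is meant to enable.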

DISCUSSION

The semCDI query formulation uses the rich semantic metadata available in caBIG to build queries and integrate data from multiple sources. Its promise will be further enhanced as more data services are registered in caBIG, and as more linkages can be achieved between the knowledge contained within caBIG's NCI Thesaurus and the data contained in the Data Services.

CONCLUSION

semCDI provides a formulation for the creation of queries on the semantic representation of caBIG. This constitutes the foundation to build a semantic data integration system for more efficient and effective querying and exploratory searching of cancer-related data.
