Department of Electronics and Computer Science, University of Santiago de Compostela, Edificio Monte da Condesa, Spain.
BMC Med Inform Decis Mak. 2012 Jul 31;12:78. doi: 10.1186/1472-6947-12-78.
Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction.
Because ontological and terminological resources that have already reached some consensus in biomedicine are now available, a reuse-based ontology engineering approach was followed. The proposed approach uses the Web Ontology Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query-Enhanced Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of Semantic Web technology in the biomedical research domain of cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies.
A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopts the standard logical model of an open world assumption.
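To make the rule-and-query idea concrete, the following is a minimal sketch in plain Python rather than OWL, SWRL, or SQWRL. It is not the authors' implementation: the patients, variants, class names, property names (`hasFinding`, `hasPhenotype`, `hasVariant`), and the single rule are all hypothetical and chosen only for illustration. It shows one Horn-like rule lifting a recorded clinical finding to more abstract phenotype classes, followed by queries in both directions (phenotype to genotype and genotype to phenotype).

```python
# Hypothetical patient facts as (subject, predicate, object) triples.
# None of these values come from the paper's dataset.
FACTS = {
    ("patient1", "hasFinding", "TendonXanthoma"),
    ("patient1", "hasVariant", "variantA"),
    ("patient2", "hasFinding", "Cataract"),
    ("patient2", "hasVariant", "variantB"),
}

# Hypothetical toy class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "TendonXanthoma": "Xanthoma",
    "Xanthoma": "Phenotype",
    "Cataract": "EyeAbnormality",
    "EyeAbnormality": "Phenotype",
}


def ancestors(cls):
    """Yield cls and every superclass reachable in the toy hierarchy."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def apply_rules(facts):
    """Apply one Horn-like rule:
    hasFinding(?p, ?f) and subClassOf*(?f, ?c) -> hasPhenotype(?p, ?c)
    """
    inferred = set(facts)
    for subject, predicate, obj in facts:
        if predicate == "hasFinding":
            for cls in ancestors(obj):
                inferred.add((subject, "hasPhenotype", cls))
    return inferred


def variants_for_phenotype(kb, phenotype):
    """Phenotype -> genotype: variants carried by patients with the phenotype."""
    patients = {s for s, p, o in kb if p == "hasPhenotype" and o == phenotype}
    return sorted({o for s, p, o in kb if p == "hasVariant" and s in patients})


def phenotypes_for_variant(kb, variant):
    """Genotype -> phenotype: phenotypes inferred for carriers of the variant."""
    patients = {s for s, p, o in kb if p == "hasVariant" and o == variant}
    return sorted({o for s, p, o in kb if p == "hasPhenotype" and s in patients})


kb = apply_rules(FACTS)
print(variants_for_phenotype(kb, "Xanthoma"))
print(variants_for_phenotype(kb, "Phenotype"))
print(phenotypes_for_variant(kb, "variantA"))
```

Querying at `Xanthoma` versus `Phenotype` illustrates what "different levels of abstraction" means here: the same patient record answers both queries because the rule has already materialized the more general classes.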
This work demonstrates how Semantic Web technologies can support the flexible representation and computational inference mechanisms required to query patient datasets at different levels of abstraction. The open world assumption is especially well suited to describing phenotype-genotype relationships that are only partially known, in a way that is easily extensible. In the future, this type of approach could offer researchers a valuable resource for inferring new data from patient data for statistical analysis in translational research. In conclusion, phenotype description formalization and mapping to clinical data are two key elements for interchanging knowledge between basic and clinical research.
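The practical difference between open and closed world reasoning can be sketched as follows. This is an illustrative toy in plain Python, not OWL reasoning, and the patients and phenotype statements are hypothetical. Under a closed world, a missing statement counts as false; under an open world, it remains unknown until the knowledge base says otherwise.

```python
from typing import Optional

# Hypothetical statements: what is known to hold and what is known not to hold.
ASSERTED = {("patient1", "Xanthoma")}
DENIED = {("patient2", "Xanthoma")}


def closed_world(patient: str, phenotype: str) -> bool:
    """Closed world: anything not asserted is treated as false."""
    return (patient, phenotype) in ASSERTED


def open_world(patient: str, phenotype: str) -> Optional[bool]:
    """Open world: True or False only when known; None means unknown."""
    if (patient, phenotype) in ASSERTED:
        return True
    if (patient, phenotype) in DENIED:
        return False
    return None


# patient3 has no recorded statement about Xanthoma either way.
print(closed_world("patient3", "Xanthoma"))
print(open_world("patient3", "Xanthoma"))
```

Returning `None` for the unrecorded case mirrors why partially known phenotype-genotype relationships can be extended later without contradicting earlier answers.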