Mate Sebastian, Köpcke Felix, Toddenroth Dennis, Martin Marcus, Prokosch Hans-Ulrich, Bürkle Thomas, Ganslandt Thomas
Institute for Medical Informatics, University Erlangen-Nuremberg, Erlangen, Germany.
Center for Medical Information and Communication, Erlangen University Hospital, Erlangen, Germany.
PLoS One. 2015 Jan 14;10(1):e0116656. doi: 10.1371/journal.pone.0116656. eCollection 2015.
Data from the electronic medical record comprise numerous structured but uncoded elements, which are not linked to standard terminologies. Reuse of such data for secondary research purposes has recently gained importance. However, identifying relevant data elements and creating database jobs for extraction, transformation and loading (ETL) are challenging: with current methods such as data warehousing, it is not feasible to efficiently maintain and reuse semantically complex data extraction and transformation routines. We present an ontology-supported approach that overcomes this challenge through abstraction: instead of defining ETL procedures at the database level, we use ontologies to organize and describe the medical concepts of both the source system and the target system. Instead of using unique, specifically developed SQL statements or ETL jobs, we define declarative transformation rules within ontologies and illustrate how these constructs can then be used to automatically generate SQL code to perform the desired ETL procedures. This demonstrates how a suitable level of abstraction may not only aid the interpretation of clinical data, but can also foster the reuse of methods for unlocking it.
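The idea of generating SQL from declarative transformation rules can be illustrated with a minimal sketch. This is not the authors' implementation: plain Python objects stand in for ontology-encoded rules, and every name here (`SourceItem`, `MappingRule`, `generate_sql`, the tables `vitals_form` and `target_facts`, the concept `body_weight_kg`) is a hypothetical chosen for illustration only.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceItem:
    """Hypothetical description of one source-system data element."""
    table: str
    column: str


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical declarative rule: source element -> target concept.

    `expression` is an SQL template in which {value} stands for the
    source column; the default passes the value through unchanged.
    """
    target_concept: str
    source: SourceItem
    expression: str = "{value}"


def generate_sql(rule: MappingRule, target_table: str = "target_facts") -> str:
    """Render one rule as an INSERT ... SELECT statement.

    Identifiers are interpolated directly, so in this sketch rules must
    come from a trusted source (assumption: no user-supplied input).
    """
    value_expr = rule.expression.format(value=rule.source.column)
    return (
        f"INSERT INTO {target_table} (patient_id, concept, value) "
        f"SELECT patient_id, '{rule.target_concept}', {value_expr} "
        f"FROM {rule.source.table} "
        f"WHERE {rule.source.column} IS NOT NULL"
    )


def run_demo() -> list:
    """Apply one rule to an in-memory SQLite database and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vitals_form (patient_id INTEGER, weight_g REAL)")
    conn.execute(
        "CREATE TABLE target_facts (patient_id INTEGER, concept TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO vitals_form VALUES (?, ?)", [(1, 72000.0), (2, None)]
    )
    # Rule: weight recorded in grams in the source, kilograms in the target.
    rule = MappingRule(
        "body_weight_kg", SourceItem("vitals_form", "weight_g"), "{value} / 1000.0"
    )
    conn.execute(generate_sql(rule))
    rows = conn.execute(
        "SELECT patient_id, concept, value FROM target_facts"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the rule is the only thing that needs maintaining; the SQL is derived from it, so changing a unit conversion or pointing the rule at a different source column does not require editing hand-written ETL code.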