An ontology-based approach for developing a harmonised data-validation tool for European cancer registration.

Affiliation

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Publication information

J Biomed Semantics. 2021 Jan 6;12(1):1. doi: 10.1186/s13326-020-00233-x.

Abstract

BACKGROUND

Population-based cancer registries are an important information source in cancer epidemiology. Studies that collate and compare data across regional and national boundaries have proved important for deploying and evaluating effective cancer-control strategies. Comparing cancer indicators correctly across those boundaries depends on a good and harmonised level of data quality, which is a primary motivation for collecting pseudonymised data centrally. The European Union's recently introduced General Data Protection Regulation (GDPR) imposes stricter conditions on the collection, processing, and sharing of personal data, and it treats pseudonymised data as personal data. The new regulation therefore creates a need for solutions that allow the processes yielding harmonised European cancer-registry data to continue smoothly. One such solution would be a data-validation software tool based on a formalised representation of the harmonised data-validation rules, which would eventually allow the data-validation process to be devolved to the local level.

RESULTS

A semantic data model was derived from the data-validation rules for harmonising cancer-data variables at the European level. The data model was encapsulated in an ontology developed using the Web Ontology Language (OWL), with the data-model entities forming the main OWL classes. The data-validation rules were added to the ontology as axioms. Reasoning over the resulting ontology was shown to trap registry-coding errors and, in some instances, to correct them.
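The idea of keeping validation rules as declarative statements next to the data model can be illustrated with a minimal sketch. This is not OWL and does not use a reasoner: plain Python predicates stand in for axioms, and a simple loop stands in for the reasoning step. The record fields, class names, and the single sex/site consistency rule are hypothetical illustrations, not taken from the paper's rule set; only the ICD-O-3 topography codes C56 (ovary) and C61 (prostate gland) are real.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TumourRecord:
    """Hypothetical, heavily simplified cancer-registry record."""
    sex: str         # "M" or "F"
    topography: str  # ICD-O-3 topography code, e.g. "C56.9"


@dataclass(frozen=True)
class Rule:
    """A named validation rule; plays the role an axiom plays in the ontology."""
    name: str
    violated_by: Callable[[TumourRecord], bool]


# Illustrative subsets only; a real rule set would be far larger.
FEMALE_ONLY_SITES = {"C56"}  # ovary
MALE_ONLY_SITES = {"C61"}    # prostate gland


def site(record: TumourRecord) -> str:
    """Return the three-character site part of the topography code."""
    return record.topography.split(".")[0]


# The rules live alongside the data model, so both can be versioned together.
RULES: List[Rule] = [
    Rule("female-only site recorded for male patient",
         lambda r: r.sex == "M" and site(r) in FEMALE_ONLY_SITES),
    Rule("male-only site recorded for female patient",
         lambda r: r.sex == "F" and site(r) in MALE_ONLY_SITES),
]


def validate(record: TumourRecord) -> List[str]:
    """Return the names of all rules that the record violates."""
    return [rule.name for rule in RULES if rule.violated_by(record)]
```

Calling `validate(TumourRecord(sex="M", topography="C56.9"))` flags the first rule, while a consistent record returns an empty list. In an OWL setting the same constraint would be stated as an axiom, and an inconsistency reported by the reasoner would take the place of the returned rule name.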

CONCLUSIONS

Describing the European cancer-registry core data set as an OWL ontology yields a tool, based on a formalised set of axioms, for validating a cancer registry's data set against harmonised, supra-national rules. Because the data checks are inherently linked to the data model, maintenance overhead would be lower and versions could be synchronised automatically, which is important for distributed data-quality checking processes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ad0/7789225/c9a4b8561452/13326_2020_233_Fig1_HTML.jpg
