Stausberg Jürgen, Kuklik Nils, Jöckel Karl-Heinz
University of Duisburg-Essen, Faculty of Medicine, Institute for Medical Informatics, Biometry and Epidemiology (IMIBE), Germany.
Stud Health Technol Inform. 2018;247:566-570.
Several dimensions of data quality are described in the literature. One overriding aspect is the extent to which data represent the truth, which is captured by data validity. Unfortunately, a common terminology, well-defined concepts, and approved measures are missing with regard to data validity. In particular, there is a need to discuss the gold standard that serves as the reference for the data at hand, as well as the respective measures. The ultimate gold standard would be the state of the patient, which is itself subject to human and personal interpretation. In practice, source data, which often take diverse forms, are used as the gold standard. Depending on the concept of the measure, it might be inappropriate to differentiate between present and absent when calculating precision and recall. Due to the complexity and uncertainty of many health care related issues, a more sophisticated comparison might be necessary in order to establish relevant and general figures of data quality. Unfortunately, no harmonization in this field is in sight. Further research is needed to establish validated standards for measuring data quality.
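To make the binary comparison concrete, the following Python sketch computes precision and recall for a data set checked against a gold standard, treating each item as either present or absent. It is an illustration only: the function name, the (patient, item) record layout, and the sample values are hypothetical and do not come from the source.

```python
def precision_recall(recorded, gold):
    """Compare recorded entries with a gold standard.

    Both arguments are sets of (patient_id, item) pairs that are
    marked as present. Anything not in a set counts as absent.
    Returns (precision, recall); a value is None when undefined.
    """
    true_pos = len(recorded & gold)    # present in both
    false_pos = len(recorded - gold)   # recorded, but absent in gold standard
    false_neg = len(gold - recorded)   # in gold standard, but not recorded

    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else None
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else None
    return precision, recall


# Hypothetical example: source data (gold standard) versus the data at hand.
gold = {("p1", "diabetes"), ("p1", "hypertension"),
        ("p2", "asthma"), ("p3", "diabetes")}
recorded = {("p1", "diabetes"), ("p2", "asthma"),
            ("p2", "hypertension"), ("p3", "diabetes")}

precision, recall = precision_recall(recorded, gold)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this sketch every item is forced into present or absent, so a partially correct or uncertain entry has no representation, which is the kind of limitation a more sophisticated comparison would have to address.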