Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, Hannover, Niedersachsen, Germany.
Big Data in Medicine, Department of Health Services Research, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, Oldenburg, Niedersachsen, Germany.
Methods Inf Med. 2023 Jun;62(S 01):e1-e9. doi: 10.1055/s-0042-1760238. Epub 2023 Jan 11.
Data quality issues can cause clinical decision support systems (CDSSs) to make false decisions. Analyzing local data quality has the potential to prevent CDSS adoption from failing for data quality-related reasons.
To define a shareable set of applicable measurement methods (MMs) for a targeted data quality assessment determining the suitability of local data for our CDSS.
We derived task-specific MMs using four approaches: (1) a GUI-based data quality analysis using the open source tool . (2) An analysis of cases of known false CDSS decisions. (3) Data-driven learning on MM results. (4) A systematic check for blind spots in our set of MMs, based on the data quality framework. We expressed the derived data quality-related knowledge about the CDSS using the 5-tuple formalization for MMs.
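To illustrate the idea of expressing an MM as a 5-tuple and collecting MMs in a knowledge base, the following is a minimal sketch only. The abstract does not name the five components, so every field name here (`name`, `data_path`, `measure`, `check`, `description`), the helper `apply_mm`, the example records, and the 0.25 threshold are hypothetical and are not taken from the authors' formalization or tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class MeasurementMethod:
    """One measurement method (MM) as a 5-tuple.

    All five field names are illustrative assumptions, not the paper's terms.
    """
    name: str                                    # what is measured
    data_path: str                               # which data item is inspected
    measure: Callable[[Sequence[object]], float] # how the result is computed
    check: Optional[Callable[[float], bool]]     # optional pass/fail rule
    description: str                             # human-readable explanation


def missing_ratio(values: Sequence[object]) -> float:
    """Share of missing (None) values; an empty input counts as fully missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def apply_mm(mm: MeasurementMethod, records: Sequence[dict]) -> dict:
    """Apply one MM to a list of records and return its result."""
    values = [record.get(mm.data_path) for record in records]
    result = mm.measure(values)
    passed = mm.check(result) if mm.check is not None else None
    return {"mm": mm.name, "result": result, "passed": passed}


# A hypothetical knowledge base holding a single MM. Heart rate is used
# because it is one of the SIRS criteria; the threshold is made up.
knowledge_base = [
    MeasurementMethod(
        name="heart_rate_missing_ratio",
        data_path="heart_rate",
        measure=missing_ratio,
        check=lambda ratio: ratio <= 0.25,
        description="Share of records without a heart rate value.",
    )
]

# Invented example records, not real patient data.
records = [
    {"heart_rate": 120},
    {"heart_rate": None},
    {"heart_rate": 95},
    {"heart_rate": 110},
]

report = [apply_mm(mm, records) for mm in knowledge_base]
print(report)
```

Keeping each MM as plain, declarative data is one way such a set could be shared between sites and re-applied to local data; whether the authors' knowledge bases are structured like this is not stated in the abstract.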
We identified some task-specific dataset characteristics that a targeted data quality assessment for our use case should inspect. Altogether, we defined 394 MMs organized in 13 data quality knowledge bases.
We have created a set of shareable, applicable MMs that can support targeted data quality assessment for CDSS-based detection of systemic inflammatory response syndrome (SIRS) in critically ill pediatric patients. With the demonstrated approaches for deriving and expressing task-specific MMs, we intend to help promote targeted data quality assessment as a commonly recognized, routine part of research on data-consuming application systems in health care.