The University of Western Australia, Australia.
Monash University, Australia.
Health Inf Manag. 2021 Jan-May;50(1-2):64-75. doi: 10.1177/1833358320908957. Epub 2020 Mar 27.
Data quality is fundamental to the integrity of quantitative research. The role of external researchers in data quality assessment (DQA) remains ill-defined when large, centrally curated health datasets are put to secondary use for research. To investigate the equity of palliative care provided to Indigenous Australian patients, researchers accessed a now-historical version of a national palliative care dataset that had been developed primarily for continuous quality improvement.
The objectives were (i) to apply a generic DQA framework to the dataset and (ii) to report the process and results of this assessment and examine its consequences for conducting the research.
The data were systematically examined for completeness, consistency and credibility. Data quality issues relevant to the Indigenous identifier and framing of research questions were of particular interest.
The dataset comprised 477,518 records of 144,951 patients (Indigenous = 1515; missing Indigenous identifier = 4998) collected from participating specialist palliative care services during a period (1 January 2010 to 30 June 2015) in which data-checking systems underwent substantial upgrades. Completeness of the data improved progressively over the study period. The data were error-free with respect to many credibility and consistency checks, and the anomalies that were detected were reported to data managers. Because the proportion of missing values remained substantial for some clinical care variables, multiple imputation procedures were used in subsequent analyses.
In secondary use of large curated datasets, DQA by external researchers may both influence proposed analytical methods and contribute to improvement of data curation processes through feedback to data managers.
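As an illustration only, the three dimensions examined above (completeness, consistency and credibility) could be sketched as simple record-level checks. This is a minimal sketch and not the framework or code used in the study: the field names (`indigenous`, `episode_start`, `episode_end`, `age`), the sample records and the plausible age range are all hypothetical assumptions.

```python
from datetime import date

# Hypothetical records: field names and values are illustrative assumptions,
# not variables taken from the palliative care dataset.
records = [
    {"patient_id": 1, "indigenous": "no",
     "episode_start": date(2010, 3, 1), "episode_end": date(2010, 3, 9), "age": 71},
    {"patient_id": 2, "indigenous": None,
     "episode_start": date(2012, 6, 5), "episode_end": date(2012, 6, 1), "age": 64},
    {"patient_id": 3, "indigenous": "yes",
     "episode_start": date(2014, 1, 10), "episode_end": None, "age": 130},
]


def completeness(rows, field):
    """Completeness: proportion of rows with a non-missing value for `field`."""
    if not rows:
        return 0.0
    return sum(r.get(field) is not None for r in rows) / len(rows)


def inconsistent_dates(rows):
    """Consistency: patient IDs whose episode ends before it starts."""
    return [
        r["patient_id"]
        for r in rows
        if r["episode_start"] is not None
        and r["episode_end"] is not None
        and r["episode_end"] < r["episode_start"]
    ]


def implausible_ages(rows, low=0, high=120):
    """Credibility: patient IDs whose age falls outside an assumed plausible range."""
    return [
        r["patient_id"]
        for r in rows
        if r["age"] is not None and not (low <= r["age"] <= high)
    ]


print(completeness(records, "indigenous"))
print(inconsistent_dates(records))
print(implausible_ages(records))
```

Applied to the counts reported above, the same completeness calculation gives a missing Indigenous identifier for 4998 of 144,951 patients, or about 3.4%.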