Ji Limei, Geraedts Max, de Cruppé Werner
Institute for Health Services Research and Clinical Epidemiology, Philipps-Universität Marburg, Karl-von-Frisch-Strasse 4, Marburg, 35043, Germany.
BMC Med Res Methodol. 2024 Dec 31;24(1):325. doi: 10.1186/s12874-024-02429-6.
Health services research often relies on secondary data, necessitating quality checks for completeness, validity, and potential errors before use. Various methods address implausible data, including data elimination, statistical estimation, or value substitution from the same or another dataset. This study presents an internal validation process of a secondary dataset used to investigate hospital compliance with minimum caseload requirements (MCR) in Germany. The secondary data source validated is the German Hospital Quality Reports (GHQR), an official dataset containing structured self-reported data from all hospitals in Germany.
This study conducted an internal cross-field validation of MCR-related data in the GHQR from 2016 to 2021. The validation process checked the validity of reported MCR caseloads, including data availability and consistency, by comparing the stated MCR caseload with other variables in the GHQR. Subsequently, implausible MCR caseload values were corrected using the most plausible values given in the same GHQR. The study also analysed the error sources and used reimbursement-related Diagnosis-Related Groups Statistic data to assess the validation outcomes.
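The cross-field check described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the status labels, and the decision rule are assumptions made for illustration only, and `derived` stands for a hypothetical caseload taken from another field of the same report.

```python
from typing import Optional, Tuple


def check_caseload(stated: Optional[int],
                   derived: Optional[int]) -> Tuple[Optional[int], str]:
    """Compare a stated MCR caseload with a value from another report field.

    Returns (value_to_use, status). No record is dropped; each one is
    labelled. The rule below is an illustrative assumption, not the
    published procedure.
    """
    if stated is None and derived is None:
        # Neither field is filled in: nothing to check or correct.
        return None, "unavailable"
    if derived is None:
        # No second field to compare against: keep the stated value.
        return stated, "unverified"
    if stated == derived:
        return stated, "consistent"
    if stated is None or (stated == 0 and derived > 0):
        # Non-reporting: fill in the value found elsewhere in the report.
        return derived, "corrected"
    # Both fields are filled in but disagree: keep the stated value and flag it.
    return stated, "ambiguous"


if __name__ == "__main__":
    # Hypothetical example values, not taken from the GHQR.
    for stated, derived in [(None, 12), (30, 30), (25, 40), (None, None)]:
        print(stated, derived, check_caseload(stated, derived))
```

In this sketch, a missing or zero stated caseload is filled from the second field, while a genuine disagreement is only flagged, so no record is eliminated.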
The analysis focused on four MCR procedures. Between 11.8% and 27.7% of the total MCR caseload values in the GHQR appeared ambiguous, and 7.9-23.7% were corrected. The correction added 0.7-3.7% of cases not previously stated as MCR caseloads and added 1.5-26.1% of hospital sites as MCR-performing hospitals not previously stated in the GHQR. The main error source was the non-reporting of MCR caseloads, especially by hospitals with low case numbers. The basic plausibility control implemented by the Federal Joint Committee since 2018 has improved MCR-related data quality over time.
This study employed a comprehensive approach to internal dataset validation that encompassed (1) hospital association level data, (2) hospital site level data, (3) medical department level data, (4) report data spanning six years, and (5) logical plausibility checks. To ensure data completeness, we selected the most plausible values without eliminating incomplete or implausible data. For future practice, we recommend a validation process when using the GHQR as a data source for MCR-related research. Additionally, an adapted plausibility control could help to improve the quality of MCR documentation.