Lund Katrine Hjuler, Fuglsang Cecilia Hvitfeldt, Schmidt Sigrun Alba Johannesdottir, Schmidt Morten
Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus, Denmark.
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
Clin Epidemiol. 2024 Dec 10;16:865-900. doi: 10.2147/CLEP.S471335. eCollection 2024.
The increasing use of routinely collected health data for research places great demands on data quality. The Danish National Patient Registry (DNPR) is renowned for its longitudinal data registration since 1977 and is a commonly used data source for cardiovascular epidemiology.
To provide an overview of cardiovascular data quality in the DNPR and to examine its determinants.
We performed a systematic literature search of MEDLINE (PubMed) and the Danish Medical Journal, and identified papers validating cardiovascular variables in the DNPR during 1977-2024. We also included papers from reference lists, citations, journal e-mail notifications, and colleagues. Measures of data quality included the positive predictive value (PPV), negative predictive value, sensitivity, and specificity.
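For readers less familiar with the four measures named above, the following is a minimal sketch of how they are conventionally computed from a 2x2 validation table (registry code present or absent versus reference standard positive or negative). The function name, the class name, and the counts in the usage note are hypothetical and are not taken from the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidityMeasures:
    ppv: float          # share of registry-positive records confirmed by the reference
    npv: float          # share of registry-negative records confirmed as negative
    sensitivity: float  # share of true cases that the registry captures
    specificity: float  # share of true non-cases that the registry leaves uncoded


def validity_measures(tp: int, fp: int, fn: int, tn: int) -> ValidityMeasures:
    """Compute the four standard validity measures from 2x2 cell counts.

    tp: coded in registry and confirmed by reference standard
    fp: coded in registry but not confirmed
    fn: not coded in registry but positive by reference standard
    tn: not coded in registry and negative by reference standard
    """
    def ratio(numerator: int, denominator: int) -> float:
        # An empty denominator means the measure is undefined for this sample.
        return numerator / denominator if denominator else float("nan")

    return ValidityMeasures(
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
    )
```

With illustrative counts, `validity_measures(tp=90, fp=10, fn=30, tn=870)` gives a PPV of 0.90 and a sensitivity of 0.75. A validation study that samples only registry-positive records can estimate the PPV but not the other three measures, since it observes no registry-negative records.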
We screened 2,049 papers and identified 63 relevant papers covering a total of 229 cardiovascular variables. Of these, 200 variables assessed diagnoses, 24 assessed treatments (10 surgeries and 14 other treatments), and 5 assessed examinations. The data quality varied substantially between variables. Overall, the PPV was ≥90% for 36% of variables, 80-89% for 26%, 70-79% for 16%, 60-69% for 7%, 50-59% for 4%, and <50% for 11% of variables. The PPV was generally higher for treatments (PPV≥95% for 92%) and examinations (PPV≥95% for 100%) than for diagnoses (PPV≥80% for 71%). Moreover, the PPV for an individual diagnosis varied depending on the algorithm used to identify it. Key determinants of validity were patient contact type (inpatient vs outpatient), diagnosis type (primary vs secondary), setting (university vs regional hospitals), and calendar year.
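The PPV bands reported above can be reproduced for any list of variable-level PPVs with a short tally. The sketch below assumes PPVs are expressed as percentages and that each band's lower bound is inclusive (so 89.9 falls in 80-89%); how the review handled fractional values is not stated, and the function names and the input values in the usage note are hypothetical.

```python
from collections import Counter

# Bands as reported in the abstract, ordered from highest to lowest.
# Each tuple is (inclusive lower bound in percent, label).
PPV_BANDS = [
    (90.0, ">=90%"),
    (80.0, "80-89%"),
    (70.0, "70-79%"),
    (60.0, "60-69%"),
    (50.0, "50-59%"),
]
LOWEST_BAND = "<50%"


def ppv_band(ppv_percent: float) -> str:
    """Return the band label for one PPV given in percent (0-100)."""
    if not 0.0 <= ppv_percent <= 100.0:
        raise ValueError(f"PPV must be between 0 and 100, got {ppv_percent}")
    for lower_bound, label in PPV_BANDS:
        if ppv_percent >= lower_bound:
            return label
    return LOWEST_BAND


def band_shares(ppvs_percent: list[float]) -> dict[str, float]:
    """Return the percentage of variables falling in each band."""
    if not ppvs_percent:
        raise ValueError("at least one PPV is required")
    counts = Counter(ppv_band(p) for p in ppvs_percent)
    labels = [label for _, label in PPV_BANDS] + [LOWEST_BAND]
    return {label: 100.0 * counts[label] / len(ppvs_percent) for label in labels}
```

For example, `band_shares([95, 92, 85, 40])` assigns half of the four made-up variables to the top band and a quarter each to 80-89% and <50%.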
The validity of cardiovascular variables in the DNPR is high for treatments and examinations but varies considerably between individual diagnoses depending on the algorithm used to define them.