Clinical Practice Research Datalink, Medicines and Healthcare Products Regulatory Agency, London, UK.
Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK.
BMJ Open. 2024 Feb 14;14(2):e070258. doi: 10.1136/bmjopen-2022-070258.
To explore whether UK primary care databases arising from two different software systems can be feasibly combined, by comparing rates of Huntington's disease (HD, which is rare) and 14 common cancers in the two databases, as well as characteristics of people with these conditions.
Descriptive study.
Primary care electronic health records from Clinical Practice Research Datalink (CPRD) GOLD and CPRD Aurum databases, with linked hospital admission and death registration data.
4986 patients with HD and 1 294 819 with an incident cancer between 1990 and 2019.
Incidence and prevalence of HD by calendar period, age group and region, and annual age-standardised incidence of 14 common cancers in each database, and in a subset of 'overlapping' practices which contributed to both databases. Characteristics of patients with HD or incident cancer: medical history, recent prescribing, healthcare contacts and database follow-up.
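As an illustration of the age-standardised incidence outcome, the following is a minimal sketch assuming direct standardisation to a reference population. The abstract does not state which standardisation method or standard population was used, so the function name, the weights and the toy numbers in the usage note are hypothetical.

```python
def age_standardised_rate(cases, person_years, std_weights, per=100_000):
    """Directly age-standardised incidence rate per `per` person-years.

    cases, person_years and std_weights are parallel sequences with one
    entry per age group. std_weights are the reference-population sizes
    (or proportions) for each age group; they are normalised here.
    All names are illustrative assumptions, not taken from the study.
    """
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("inputs must have one entry per age group")
    total_weight = sum(std_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")

    rate = 0.0
    for c, py, w in zip(cases, person_years, std_weights):
        if py <= 0:
            raise ValueError("person-years must be positive in every age group")
        # Age-specific rate weighted by that group's share of the standard population.
        rate += (c / py) * (w / total_weight)
    return rate * per
```

For example, with two equally weighted age groups, `age_standardised_rate([10, 20], [1000, 1000], [1, 1])` averages the age-specific rates of 0.01 and 0.02 per person-year. Applying the same weights to each database removes differences in incidence that are due only to differing age structures.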
Incidence and prevalence of HD were slightly higher in CPRD GOLD than in CPRD Aurum, but with similar trends over time. Cancer incidence in the two databases differed between 1990 and 2000, but converged and was very similar thereafter. Patients in the two databases were most similar in medical history (median standardised difference (MSD) 0.03, IQR 0.01-0.03), recent prescribing (MSD 0.06, IQR 0.03-0.10), and demographic and general health variables (MSD 0.05, IQR 0.01-0.09). Larger differences were seen for healthcare contacts (MSD 0.27, IQR 0.10-0.41) and database follow-up (MSD 0.39, IQR 0.19-0.56).
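To make the MSD summary concrete, here is a minimal sketch of a standardised difference for a binary characteristic, and of the median taken across several characteristics. The abstract does not give the formula used, so this assumes the commonly used definition: the absolute difference in proportions divided by the square root of the average of the two binomial variances. The function names and any inputs are hypothetical.

```python
import math
from statistics import median


def std_diff_binary(p1, p2):
    """Absolute standardised difference between two proportions.

    p1 and p2 are the proportions of patients with a characteristic in
    each database. Assumed formula (not stated in the abstract):
    |p1 - p2| / sqrt((p1*(1-p1) + p2*(1-p2)) / 2).
    """
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1: identical means no
        # difference, otherwise the difference is unbounded.
        return 0.0 if p1 == p2 else math.inf
    return abs(p1 - p2) / math.sqrt(pooled_var)


def median_std_diff(proportion_pairs):
    """Median of the standardised differences over (p1, p2) pairs."""
    return median(std_diff_binary(p1, p2) for p1, p2 in proportion_pairs)
```

Because a standardised difference is scaled by the pooled variability rather than by sample size, it stays interpretable in very large cohorts, where significance tests would flag even trivial differences.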
Differences in cancer incidence trends between 1990 and 2000 may relate to use of a practice-level data quality filter (the 'up-to-standard' date) in CPRD GOLD only. As well as the impact of data curation methods, differences in underlying data models can make it more challenging to define exactly equivalent clinical concepts in each database. Researchers should be aware of these potential sources of variability when planning combined database studies and interpreting results.