Nuffield Department of Population Health, University of Oxford, Oxford, UK.
UK Biobank Ltd, Stockport, UK.
Eur J Epidemiol. 2024 Feb;39(2):219-229. doi: 10.1007/s10654-023-01095-0. Epub 2024 Jan 16.
The UK Biobank (UKB) has made general practitioner (GP) data (censoring date 2016-2017) available for approximately 45% of the cohort, whilst hospital inpatient and death registry (referred to as "HES/Death") data are available cohort-wide through 2018-2022, depending on whether the data come from England, Wales or Scotland. We assessed the importance of case ascertainment via different data sources in UKB for three diseases that are usually first diagnosed in primary care: Parkinson's disease (PD), type 2 diabetes (T2D), and all-cause dementia. Including GP data at least doubled the number of incident cases in the subset of the cohort with primary care data (e.g. from 619 to 1390 for dementia). Among the 786 dementia cases that were captured only in the GP data before the GP censoring date, only 421 (54%) were subsequently recorded in HES. Therefore, estimates of the absolute incidence or risk-stratified incidence are misleadingly low when based only on the HES/Death data. For incident cases present in both HES/Death and GP data during the full follow-up period (i.e. until the HES censoring date), the median time difference between an incident diagnosis of dementia being recorded in GP and HES/Death data was 2.25 years (i.e. recorded 2.25 years earlier in the GP records). Similar lag periods were also observed for PD (median 2.31 years earlier) and T2D (median 2.82 years earlier). For participants with an incident GP diagnosis, only 65.6% of dementia cases, 69.0% of PD cases, and 58.5% of T2D cases had their diagnosis recorded in HES/Death within 7 years of the GP diagnosis. The effect estimates (hazard ratios, HRs) of established risk factors for the three health outcomes mostly remained in the same direction, with a similar strength of association, whether cases were ascertained using HES only or with GP data added. The confidence intervals of the HRs became narrower when GP data were added, due to the increased statistical power from the additional cases.
In conclusion, it is desirable to extend both the coverage and follow-up period of GP data to allow researchers to maximise case ascertainment of chronic health conditions in the UK.