Population Data Science, Swansea University Medical School, Swansea, UK
BMJ Open. 2024 Aug 3;14(8):e077675. doi: 10.1136/bmjopen-2023-077675.
This study aims to create a national ethnicity spine based on all available ethnicity records in linkable anonymised electronic health record and administrative data sources.
This is a longitudinal study using anonymised individual-level, population-scale ethnicity data from 26 data sources available within the Secure Anonymised Information Linkage (SAIL) Databank.
The national ethnicity spine is built from longitudinal national data for the population of Wales, UK, covering 22 years (2000 to 2021).
A total of 46 million ethnicity records for 4 297 694 individuals have been extracted, harmonised, deduplicated and made available within a longitudinal research ready data asset.
The outcome measures were (1) the distribution of ethnicity records over time under four different selection approaches (latest, mode, weighted mode and composite), compared across age bands, sex, deprivation quintiles, health board and residential location, and (2) the distribution and completeness of records compared with the ONS 2011 census.
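To make the selection approaches concrete, the following is a minimal Python sketch of how one ethnic group might be chosen per person from several dated records. It is an illustration under stated assumptions, not the study's implementation: the sample records, the function names, the tie-breaking rule and the recency weight are all hypothetical, and the composite approach is left out because its definition is not given here.

```python
from collections import Counter
from datetime import date

# Hypothetical records for one person: (record_date, ethnic_group).
records = [
    (date(2005, 3, 1), "White"),
    (date(2008, 9, 12), "White"),
    (date(2021, 6, 5), "Mixed"),
]


def select_latest(records):
    """Return the group on the most recent record."""
    return max(records, key=lambda r: r[0])[1]


def select_mode(records):
    """Return the most frequent group.

    Ties go to the most recent record among the tied groups
    (an assumed tie-breaking rule).
    """
    counts = Counter(group for _, group in records)
    top = max(counts.values())
    tied = [r for r in records if counts[r[1]] == top]
    return select_latest(tied)


def select_weighted_mode(records, weight):
    """Return the group with the largest summed weight.

    `weight` maps a record date to a number; its form is an assumption.
    """
    totals = Counter()
    for rec_date, group in records:
        totals[group] += weight(rec_date)
    return max(totals, key=totals.get)


def recency_weight(rec_date):
    """Assumed weight: years elapsed since 1999, so newer records count more."""
    return rec_date.year - 1999


print(select_latest(records))
print(select_mode(records))
print(select_weighted_mode(records, recency_weight))
```

In this toy example the mode keeps the older, more frequent value, while the latest and recency-weighted approaches follow the newest record, which is the kind of divergence the comparison of approaches is meant to expose.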
The distribution of the dominant group (white) is minimally affected by the choice among the four selection approaches. Among the other ethnic groups, the mixed group was the most sensitive to the selection approach: its prevalence was 0.6% under both the latest and mode approaches, 1.1% under the weighted mode and 3.1% under the composite approach. Substantial agreement with the ONS 2011 census was observed for the latest approach (kappa=0.68, 95% CI (0.67 to 0.71)) across all subgroups. The record completeness rate was over 95% in 2021.
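For readers unfamiliar with the agreement statistic reported above, here is a minimal sketch of unweighted Cohen's kappa for two sets of categorical labels on the same individuals. The label lists are invented for illustration and the function name is hypothetical; whether the study used exactly this unweighted form is an assumption. Values between 0.61 and 0.80 are conventionally described as "substantial" agreement on the Landis and Koch scale.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of individuals given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from the marginal label frequencies of each source.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both sources use a single identical label; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six people: spine selection vs census response.
spine = ["White", "White", "Asian", "Mixed", "White", "Black"]
census = ["White", "White", "Asian", "White", "White", "Black"]
kappa = cohen_kappa(spine, census)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because one group dominates the population and raw agreement alone would look high almost regardless of method.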
In conclusion, our development of the population-scale ethnicity spine provides robust ethnicity measures for healthcare research in Wales and a template which can easily be deployed in other trusted research environments in the UK and beyond.