Hatam Sara, Scully Sean Timothy, Cook Sarah, Evans Hywel T, Hume Alastair, Kallis Constantinos, Farr Ian, Orton Chris, Sheikh Aziz, Quint Jennifer K
Usher Institute, The University of Edinburgh, Edinburgh, UK.
Population Data Science, Swansea University Medical School, Swansea, UK.
Clin Epidemiol. 2024 Apr 4;16:235-247. doi: 10.2147/CLEP.S437937. eCollection 2024.
Electronic healthcare records (EHRs) are an important resource for health research and can be used to improve patient outcomes in chronic respiratory diseases. However, these datasets need to be analysed consistently, both to give coherent messaging and to support comparative studies across different populations.
We developed a harmonised curation approach to generate comparable patient cohorts for asthma, chronic obstructive pulmonary disease (COPD) and interstitial lung disease (ILD) from the Clinical Practice Research Datalink (CPRD; England), Secure Anonymised Information Linkage (SAIL; Wales) and DataLoch (Scotland), defining commonly derived variables consistently across the datasets. Working in parallel on the curation methodology for all three databases and all three diseases allowed us to highlight key differences in coding and recording between the databases and to identify solutions that enable valid comparisons.
The codelists and metadata generated have been made available to help re-create the asthma, COPD and ILD cohorts in CPRD, SAIL and DataLoch for different time periods. They also provide a starting point for curating respiratory datasets in other EHR databases, expediting further comparable respiratory research.
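As an illustration of how a shared cohort definition might be applied through database-specific codelists, the following is a minimal Python sketch and not the authors' implementation. The codes are placeholders rather than real clinical codes. The `Event` record, the `CODELISTS` mapping and the `derive_cohort` function are hypothetical names. The rule used here (first qualifying code inside a study window) is an assumption chosen for simplicity; the published codelists and metadata define the actual cohorts.

```python
from dataclasses import dataclass
from datetime import date

# Placeholder codelists (NOT real clinical codes). Each database may use its
# own coding system, so each disease maps to one code set per database.
CODELISTS = {
    "asthma": {
        "CPRD": {"CPRD_ASTHMA_1", "CPRD_ASTHMA_2"},
        "SAIL": {"SAIL_ASTHMA_1"},
        "DataLoch": {"DATALOCH_ASTHMA_1"},
    },
}


@dataclass(frozen=True)
class Event:
    """One coded record for a patient (hypothetical structure)."""
    patient_id: str
    code: str
    event_date: date


def derive_cohort(events, database, disease, start, end):
    """Return {patient_id: earliest qualifying event date} within [start, end].

    The same rule is applied to every database; only the codelist differs.
    """
    codes = CODELISTS[disease][database]
    first_seen = {}
    for ev in events:
        if ev.code in codes and start <= ev.event_date <= end:
            prev = first_seen.get(ev.patient_id)
            if prev is None or ev.event_date < prev:
                first_seen[ev.patient_id] = ev.event_date
    return first_seen


if __name__ == "__main__":
    sample = [
        Event("p1", "CPRD_ASTHMA_1", date(2015, 3, 1)),
        Event("p1", "CPRD_ASTHMA_2", date(2012, 6, 1)),
        Event("p2", "UNRELATED_CODE", date(2016, 1, 1)),
        Event("p3", "CPRD_ASTHMA_1", date(2009, 1, 1)),  # before the window
    ]
    cohort = derive_cohort(sample, "CPRD", "asthma",
                           date(2010, 1, 1), date(2020, 12, 31))
    print(cohort)
```

Changing the `start` and `end` arguments re-creates the cohort for a different time period. Supporting a further database would, under these assumptions, only require adding its code set to the mapping.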