Ding Rong, Cook Sarah, Stone Philip W, Srirathan Dharun, Shyam Yashwin, Anand Ruhan, Sudharshan Palaniappa, Quint Jennifer K
School of Public Health, Imperial College London, London, UK.
Faculty of Public Health and Policy, London School of Hygiene and Tropical Medicine, London, UK.
Clin Epidemiol. 2025 Sep 9;17:753-764. doi: 10.2147/CLEP.S529563. eCollection 2025.
Vaping and smoking are important health behaviours associated with many diseases. Evaluating these associations using electronic health record (EHR) data requires accurate codelists to determine smoking and vaping status. However, the codelists used in studies are not always published, and they are not always consistent between studies. Standard codelists, reported transparently, are needed to ensure consistency and standardization in future studies.
To provide an overview of the codes used in peer-reviewed scientific literature and codelist repositories to identify smoking and vaping status in EHRs, and to derive a recommended codelist for identifying smoking and vaping status in EHRs.
Publication databases (MEDLINE, Embase, and Scopus) and codelist repositories (LSHTM Data Compass, OpenCodelists, and the HDR UK Phenotype Library) were searched from January 2010 to April 2024. All publications and codelist repository entries with codes referring to smoking or vaping status were included in this review (search terms are listed in Supplementary Table 1). All codes were extracted to review how frequently each was used and how consistent the codelists were between studies.
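The frequency and consistency review described above could be approximated along the following lines. This is a minimal sketch, not the authors' method: the function names, the source labels, and the placeholder codes (CODE_1 and so on) are all hypothetical and stand in for real Read, SNOMED CT, or ICD codes extracted from each source.

```python
from collections import Counter


def code_frequency(codelists):
    """Count how many codelists each code appears in.

    `codelists` maps a source label to the codes it lists. Each code is
    counted at most once per source, even if it is listed twice there.
    """
    counts = Counter()
    for codes in codelists.values():
        counts.update(set(codes))
    return counts


def code_consistency(codelists):
    """Return, for each code, the share of codelists that include it."""
    total = len(codelists)
    return {code: n / total for code, n in code_frequency(codelists).items()}


# Hypothetical placeholder data, not real clinical codes.
example_codelists = {
    "publication_A": ["CODE_1", "CODE_2", "CODE_3"],
    "publication_B": ["CODE_1", "CODE_3", "CODE_3"],
    "repository_C": ["CODE_1", "CODE_4"],
}

if __name__ == "__main__":
    shares = code_consistency(example_codelists)
    # List codes from most to least widely shared across sources.
    for code, share in sorted(shares.items(), key=lambda kv: (-kv[1], kv[0])):
        print(f"{code}: used in {share:.0%} of codelists")
```

Codes that appear in most sources are natural candidates for a recommended codelist, while codes found in only one source would need individual review.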
There were 100 codelists across different coding systems: 55 from publications and 45 from codelist repository entries. For vaping status, 23 codelists were identified: 7 from publications and 16 from codelist repositories. Only 10% of publications included codelists. Few ICD codes were used; more codes were reported in Read or SNOMED CT. The codelists we subsequently developed were based on those found in the review.
Very few studies have reported the codelists they used, even though smoking status is a widely used variable in many publications and the use of vaping status is increasing. Using the information from the review, we derived codelists for smoking and vaping with a transparent methodology; these codelists can be used in future studies.