Methods to improve the quality of smoking records in a primary care EMR database: exploring multiple imputation and pattern-matching algorithms.

Affiliations

Department of Family Medicine, University of Calgary, G012 Health Sciences Centre, 3330 Hospital Drive NW, Calgary, Alberta, T2N 4N1, Canada.

Department of Community Health Sciences, University of Calgary, 3280 Hospital Drive NW, Calgary, Alberta, T2N 4Z6, Canada.

Publication information

BMC Med Inform Decis Mak. 2020 Mar 14;20(1):56. doi: 10.1186/s12911-020-1068-5.

Abstract

BACKGROUND

Primary care electronic medical record (EMR) data are emerging as a useful source for secondary uses, such as disease surveillance, health outcomes research, and practice improvement. These data capture clinical details about patients' health status, as well as behavioural risk factors, such as smoking. While the importance of documenting smoking status in a healthcare setting is recognized, the quality of smoking data captured in EMRs is variable. This study was designed to test methods aimed at improving the quality of patient smoking information in a primary care EMR database.

METHODS

EMR data from community primary care settings, extracted by two regional practice-based research networks in Alberta, Canada, were used. Patients with at least one encounter in the previous 2 years (2016-2018) and with hypertension according to a validated definition were included (n = 48,377). Multiple imputation was tested under two different assumptions about the missing data: smoking status missing at random, and smoking status missing not at random. A third method tested a novel pattern-matching algorithm developed to augment smoking information in the primary care EMR database. External validity was examined by comparing the proportions of smoking categories generated by each method with those from a general population survey.
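The abstract does not describe the rules used by the pattern-matching algorithm, so the following is only a minimal sketch of the general idea: scanning free-text EMR entries for phrases that indicate a smoking category. The category labels (`never`, `past`, `current`), the phrase lists, and the function name `classify_smoking` are all assumptions made for illustration, not the study's method.

```python
import re

# Hypothetical phrase patterns; the study's actual rules are not given in the abstract.
# Order matters: "non-smoker" and "ex-smoker" both contain "smoker", so the
# never and past patterns are checked before the current pattern.
PATTERNS = [
    ("never", re.compile(r"\b(never smoke[dr]?|non[- ]?smoker)\b", re.I)),
    ("past", re.compile(r"\b(ex[- ]?smoker|former smoker|quit smoking|stopped smoking)\b", re.I)),
    ("current", re.compile(r"\b(current smoker|smokes|smoker|\d+\s*(cigarettes|packs?)\s*(/|per|a)\s*day)\b", re.I)),
]


def classify_smoking(text):
    """Return an assumed smoking category for a free-text entry, or None if no pattern matches."""
    if not text or not text.strip():
        return None  # empty entry: treated as missing
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return None  # text present but not interpretable: still treated as missing


if __name__ == "__main__":
    for entry in ["Non-smoker", "Ex-smoker, quit 2010", "Current smoker, 1 pack/day", "BP 140/90"]:
        print(repr(entry), "->", classify_smoking(entry))
```

Entries that match no pattern stay unclassified, which mirrors the abstract's point that a pattern-matching approach can leave some records missing rather than forcing a category onto every patient.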

RESULTS

Among those with hypertension, 40.8% (n = 19,743) had smoking information that was either not recorded or not interpretable, and was therefore considered missing. Patients with missing smoking data differed statistically from those with recorded data in demographics, clinical features, and the type of EMR system used in the clinic. Both multiple imputation methods produced complete smoking status information, with the proportion of current smokers estimated at 25.3% (data missing at random) and 12.5% (data missing not at random). The pattern-matching algorithm classified 18.2% of patients as current smokers, similar to the population-based survey (18.9%), but still left smoking information missing for 23.6% of patients. The algorithm was estimated to be 93.8% accurate overall, but its accuracy varied by smoking status category.
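To see why the two missing-data assumptions can give such different estimates, it may help to write the overall proportion of current smokers as a weighted average of the proportion among patients with recorded data and the proportion assumed for patients with missing data. The sketch below uses only the counts reported above (19,743 missing out of 48,377); the function name and its arguments are illustrative assumptions, and real multiple imputation draws values from a model conditioned on covariates rather than applying a single proportion.

```python
def overall_prevalence(p_observed, p_missing, frac_missing):
    """Weighted average of prevalence in the recorded and missing groups."""
    return (1 - frac_missing) * p_observed + frac_missing * p_missing


# Counts reported in the abstract.
n_total = 48_377
n_missing = 19_743
frac_missing = n_missing / n_total  # share of patients with missing smoking data

if __name__ == "__main__":
    print(f"Missing smoking data: {frac_missing:.1%}")
    # Purely hypothetical prevalences, used only to show the direction of the effect:
    # assuming fewer smokers among the missing group lowers the overall estimate.
    for p_missing in (0.20, 0.10, 0.00):
        print(p_missing, round(overall_prevalence(0.20, p_missing, frac_missing), 3))
```

Because roughly four in ten patients have missing data, whatever is assumed about that group carries substantial weight in the overall estimate.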

CONCLUSION

Multiple imputation and algorithmic pattern-matching can both be used to improve EMR data after extraction, but the recommended method depends on the purpose of the secondary use (e.g. practice improvement or epidemiological analyses).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/355a/7071570/c3481595446a/12911_2020_1068_Fig1_HTML.jpg
