Building a tobacco user registry by extracting multiple smoking behaviors from clinical notes.

Affiliations

Dartmouth College, HB 7922, 03755, Hanover, NH, USA.

Dartmouth College, HB 7261, 03755, Hanover, NH, USA.

Publication information

BMC Med Inform Decis Mak. 2019 Jul 25;19(1):141. doi: 10.1186/s12911-019-0863-3.

Abstract

BACKGROUND

Structured fields in Electronic Health Records (EHRs) are an important source for ascertaining smoking history, but they fail to capture the nuances of smoking behaviors. Knowledge of smoking behaviors, such as pack-year history and most recent cessation date, allows care providers to select the best care plan for patients at risk of smoking-attributable diseases.

METHODS

We developed and evaluated a health informatics pipeline for identifying complete smoking history from clinical notes in EHRs. We utilized 758 patient-visit notes (from visits between 03/28/2016 and 04/04/2016) from our local EHR in addition to a public dataset of 502 clinical notes from the 2006 i2b2 Challenge to assess the performance of this pipeline. We used a machine-learning classifier to extract smoking status and a comprehensive set of text processing regular expressions to extract pack years and cessation date information from these clinical notes.
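The abstract does not list the regular expressions themselves, so the following is only a minimal sketch of what rule-based extraction of pack years and cessation year could look like. Every pattern and name here (`PACK_YEARS`, `PACKS_PER_DAY`, `YEARS_SMOKED`, `QUIT_YEAR`, `extract_smoking_history`) is hypothetical and far less comprehensive than the set the authors describe. The fallback that multiplies packs per day by years smoked uses the standard definition of a pack year and is an assumption of this sketch, not a step reported in the abstract.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not the authors' expressions.
NUM = r"\d+(?:\.\d+)?"
PACK_YEARS = re.compile(
    rf"(?P<v>{NUM})\s*-?\s*(?:pack|pk)[\s-]*(?:years?|yrs?)\b", re.I
)
PACKS_PER_DAY = re.compile(
    rf"(?P<v>{NUM})\s*(?:ppd\b|packs?\s*(?:per|a|/)\s*day\b)", re.I
)
YEARS_SMOKED = re.compile(rf"\bfor\s+(?P<v>{NUM})\s*(?:years?|yrs?)\b", re.I)
QUIT_YEAR = re.compile(
    r"\b(?:quit|stopped)\s+(?:smoking\s+)?(?:in\s+)?(?P<v>(?:19|20)\d{2})\b",
    re.I,
)


def _first_number(pattern: "re.Pattern[str]", text: str) -> Optional[float]:
    """Return the first numeric match of `pattern` in `text`, or None."""
    match = pattern.search(text)
    return float(match.group("v")) if match else None


def extract_smoking_history(note: str) -> dict:
    """Pull pack years and cessation year out of one free-text note."""
    pack_years = _first_number(PACK_YEARS, note)
    packs_per_day = _first_number(PACKS_PER_DAY, note)
    years_smoked = _first_number(YEARS_SMOKED, note)

    # Assumption of this sketch: when pack years are not stated directly,
    # derive them as packs per day multiplied by years smoked.
    if pack_years is None and packs_per_day is not None and years_smoked is not None:
        pack_years = packs_per_day * years_smoked

    quit_match = QUIT_YEAR.search(note)
    return {
        "pack_years": pack_years,
        "packs_per_day": packs_per_day,
        "years_smoked": years_smoked,
        "quit_year": int(quit_match.group("v")) if quit_match else None,
    }
```

Calling `extract_smoking_history("45 pack-year smoking history, quit in 2010.")` would fill `pack_years` and `quit_year`, while a note that mentions only packs per day leaves `pack_years` empty, which is the kind of incomplete extraction a rule-based approach has to tolerate.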

RESULTS

We identified smoking status with an F1 score of 0.90 on both the i2b2 and local data sets. Regular expression identification of pack-year history in the local test set was 91.7% sensitive and 95.2% specific, but because of variable context, pack-year extraction was incomplete in 25% of cases, capturing only packs per day or only years smoked. Regular expression identification of cessation date was 63.2% sensitive and 94.6% specific.
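For readers less familiar with the metrics reported here, the sketch below shows how sensitivity, specificity, and F1 are computed from confusion-matrix counts. The function names are illustrative and the counts in the usage note are made up for the example; the abstract does not report the underlying counts.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of true cases the extractor found: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-cases correctly left alone: TN / (TN + FP)."""
    return tn / (tn + fp)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)
```

With hypothetical counts of 9 true positives, 1 false negative, 18 true negatives, and 2 false positives, `sensitivity(9, 1)` and `specificity(18, 2)` both come to 0.9, while `f1_score(9, 2, 1)` is 18/21. High specificity with lower sensitivity, as reported for cessation date, means the patterns rarely fire wrongly but miss a sizable share of the dates that are actually present.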

CONCLUSIONS

Our work indicates that it is feasible to use an informatics pipeline to develop an EHR-based Smokers' Registry that captures smoking behaviors, not just smoking status, from free-text clinical notes. The pipeline is capable of functioning in external EHRs, reducing the time and money needed at the institution level to create a Smokers' Registry for improved identification of patient risk and of eligibility for preventive and early detection services.
