Automated detection of substance use information from electronic health records for a pediatric population.

Affiliations

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA.

Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, Ohio, USA.

Publication information

J Am Med Inform Assoc. 2021 Sep 18;28(10):2116-2127. doi: 10.1093/jamia/ocab116.

DOI:10.1093/jamia/ocab116

PMID:34333636

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8449626/

Abstract

OBJECTIVE

Substance use screening in adolescence is unstandardized and often documented in clinical notes, rather than in structured electronic health records (EHRs). The objective of this study was to integrate logic rules with state-of-the-art natural language processing (NLP) and machine learning technologies to detect substance use information from both structured and unstructured EHR data.

MATERIALS AND METHODS

Pediatric patients (10-20 years of age) with any encounter between July 1, 2012, and October 31, 2017, were included (n = 3890 patients; 19 478 encounters). EHR data were extracted at each encounter, manually reviewed for substance use (alcohol, tobacco, marijuana, opiate, any use), and coded as lifetime use, current use, or family use. Logic rules mapped structured EHR indicators to screening results. A knowledge-based NLP system and a deep learning model detected substance use information from unstructured clinical narratives. System performance was evaluated using positive predictive value, sensitivity, negative predictive value, specificity, and area under the receiver-operating characteristic curve (AUC).
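The evaluation metrics named above can be illustrated with a small Python sketch. It is not the authors' evaluation code: the function names (`count_outcomes`, `screening_metrics`, `auc`) and the toy labels are assumptions made for illustration. The AUC here uses the pairwise (Mann-Whitney) formulation, which suits small examples but is slow on large datasets.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    tn: int  # predicted negative, truly negative
    fn: int  # predicted negative, truly positive


def count_outcomes(y_true, y_pred):
    """Tally the four confusion-matrix cells from binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _safe_div(num, den):
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def screening_metrics(c):
    """PPV, sensitivity, NPV, and specificity from confusion counts."""
    return {
        "ppv": _safe_div(c.tp, c.tp + c.fp),
        "sensitivity": _safe_div(c.tp, c.tp + c.fn),
        "npv": _safe_div(c.tn, c.tn + c.fn),
        "specificity": _safe_div(c.tn, c.tn + c.fp),
    }


def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _safe_div(wins, len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(screening_metrics(count_outcomes(truth, preds)))
```

For real evaluations, an established library routine is usually preferable to hand-rolled code; the sketch only makes the definitions concrete.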

RESULTS

The dataset included 17 235 structured indicators and 27 141 clinical narratives. Manual review of clinical narratives captured 94.0% of positive screening results, while structured EHR data captured 22.0%. Logic rules detected screening results from structured data with 1.0 and 0.99 for sensitivity and specificity, respectively. The knowledge-based system detected substance use information from clinical narratives with 0.86, 0.79, and 0.88 for AUC, sensitivity, and specificity, respectively. The deep learning model further improved detection capacity, achieving 0.88, 0.81, and 0.85 for AUC, sensitivity, and specificity, respectively. Finally, integrating predictions from structured and unstructured data achieved high detection capacity across all cases (0.96, 0.85, and 0.87 for AUC, sensitivity, and specificity, respectively).
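The abstract does not describe how predictions from structured and unstructured data were combined. The sketch below shows one simple possibility, given purely as an illustration: a positive logic-rule result is treated as decisive, and in every other case the narrative model's score is used. The function name `integrate_predictions`, the 0.5 threshold, and the override rule are all assumptions, not the authors' method.

```python
from typing import Optional, Tuple


def integrate_predictions(
    rule_positive: Optional[bool],
    nlp_score: float,
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Tuple[float, bool]:
    """Combine a structured-data rule result with a narrative model score.

    rule_positive: True or False when a structured indicator exists for
        the encounter, None when there is none.
    nlp_score: model probability in [0, 1] that the narrative documents
        substance use.
    Returns (combined_score, combined_label).
    """
    if rule_positive:
        # Assumption: a positive structured indicator is trusted outright.
        combined = 1.0
    else:
        # Assumption: a missing or negative structured indicator does not
        # rule out use documented only in the narrative.
        combined = nlp_score
    return combined, combined >= threshold


if __name__ == "__main__":
    print(integrate_predictions(True, 0.10))   # structured positive wins
    print(integrate_predictions(None, 0.70))   # narrative only
    print(integrate_predictions(False, 0.20))  # both sources negative
```

The design choice to let the narrative score stand when the structured indicator is negative reflects the reported finding that narratives captured far more positive screening results (94.0%) than structured data (22.0%). Other combination schemes are equally plausible.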

CONCLUSIONS

It is feasible to detect substance use screening and results among pediatric patients using logic rules, NLP, and machine learning technologies.
