Automated detection of substance use information from electronic health records for a pediatric population.

Affiliations

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA.

Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, Ohio, USA.

Publication information

J Am Med Inform Assoc. 2021 Sep 18;28(10):2116-2127. doi: 10.1093/jamia/ocab116.

DOI:10.1093/jamia/ocab116
PMID:34333636
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8449626/
Abstract

OBJECTIVE

Substance use screening in adolescence is unstandardized and often documented in clinical notes, rather than in structured electronic health records (EHRs). The objective of this study was to integrate logic rules with state-of-the-art natural language processing (NLP) and machine learning technologies to detect substance use information from both structured and unstructured EHR data.

MATERIALS AND METHODS

Pediatric patients (10-20 years of age) with any encounter between July 1, 2012, and October 31, 2017, were included (n = 3890 patients; 19 478 encounters). EHR data were extracted at each encounter, manually reviewed for substance use (alcohol, tobacco, marijuana, opiate, any use), and coded as lifetime use, current use, or family use. Logic rules mapped structured EHR indicators to screening results. A knowledge-based NLP system and a deep learning model detected substance use information from unstructured clinical narratives. System performance was evaluated using positive predictive value, sensitivity, negative predictive value, specificity, and area under the receiver-operating characteristic curve (AUC).
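
The evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names and the toy labels are assumptions made for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation.

```python
def ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def screening_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, PPV and NPV from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among actual positives
        "specificity": ratio(tn, tn + fp),  # true negatives among actual negatives
        "ppv": ratio(tp, tp + fp),          # true positives among predicted positives
        "npv": ratio(tn, tn + fn),          # true negatives among predicted negatives
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return ratio(wins, len(pos) * len(neg))


# Toy data (hypothetical, not from the study)
labels = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.1, 0.2, 0.7, 0.3, 0.6]
print(screening_metrics(labels, predictions))
print(auc(labels, scores))
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports them together.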

RESULTS

The dataset included 17 235 structured indicators and 27 141 clinical narratives. Manual review of clinical narratives captured 94.0% of positive screening results, while structured EHR data captured 22.0%. Logic rules detected screening results from structured data with 1.0 and 0.99 for sensitivity and specificity, respectively. The knowledge-based system detected substance use information from clinical narratives with 0.86, 0.79, and 0.88 for AUC, sensitivity, and specificity, respectively. The deep learning model further improved detection capacity, achieving 0.88, 0.81, and 0.85 for AUC, sensitivity, and specificity, respectively. Finally, integrating predictions from structured and unstructured data achieved high detection capacity across all cases (0.96, 0.85, and 0.87 for AUC, sensitivity, and specificity, respectively).
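
The abstract does not state how predictions from structured and unstructured data were combined. The Python sketch below shows one simple possibility, assumed here purely for illustration: a logic rule decides whenever a usable structured indicator exists, and a narrative model score is used otherwise. The indicator values, the function names, and the 0.5 threshold are all hypothetical.

```python
from typing import Optional


def rule_result(structured_indicator: Optional[str]) -> Optional[bool]:
    """Hypothetical logic rule mapping a structured EHR indicator to a screening result."""
    if structured_indicator is None:
        return None
    value = structured_indicator.strip().lower()
    if value in {"yes", "current", "former"}:  # assumed positive codes
        return True
    if value in {"no", "never"}:               # assumed negative codes
        return False
    return None                                # unrecognized value: no decision


def integrate(structured_indicator: Optional[str],
              narrative_score: Optional[float],
              threshold: float = 0.5) -> bool:
    """Prefer the rule-based result; fall back to the narrative model score."""
    from_rules = rule_result(structured_indicator)
    if from_rules is not None:
        return from_rules
    if narrative_score is None:
        return False                           # no evidence from either source
    return narrative_score >= threshold


print(integrate("Yes", 0.10))   # structured indicator decides
print(integrate(None, 0.70))    # falls back to the narrative score
```

Giving precedence to the structured rule is a design choice motivated by the near-perfect rule performance reported above; other combinations, such as flagging a case when either source is positive, are equally plausible.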

CONCLUSIONS

It is feasible to detect substance use screening and results among pediatric patients using logic rules, NLP, and machine learning technologies.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0131/8449626/e28a9070cd40/ocab116f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0131/8449626/ef9f34f09655/ocab116f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0131/8449626/4e15f89d93bd/ocab116f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0131/8449626/5ffca1bb5386/ocab116f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0131/8449626/5ed978c2ee32/ocab116f5.jpg

Similar articles

1
Automated detection of substance use information from electronic health records for a pediatric population.
J Am Med Inform Assoc. 2021 Sep 18;28(10):2116-2127. doi: 10.1093/jamia/ocab116.
2
A Study of Social and Behavioral Determinants of Health in Lung Cancer Patients Using Transformers-based Natural Language Processing Models.
AMIA Annu Symp Proc. 2022 Feb 21;2021:1225-1233. eCollection 2021.
3
Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review.
Artif Intell Med. 2023 Dec;146:102701. doi: 10.1016/j.artmed.2023.102701. Epub 2023 Nov 1.
4
Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing.
J Biomed Inform. 2022 Mar;127:103984. doi: 10.1016/j.jbi.2021.103984. Epub 2022 Jan 7.
5
Machine learning and natural language processing (NLP) approach to predict early progression to first-line treatment in real-world hormone receptor-positive (HR+)/HER2-negative advanced breast cancer patients.
Eur J Cancer. 2021 Feb;144:224-231. doi: 10.1016/j.ejca.2020.11.030. Epub 2020 Dec 26.
6
Learning to detect and understand drug discontinuation events from clinical narratives.
J Am Med Inform Assoc. 2019 Oct 1;26(10):943-951. doi: 10.1093/jamia/ocz048.
7
Leveraging Natural Language Processing to Improve Electronic Health Record Suicide Risk Prediction for Veterans Health Administration Users.
J Clin Psychiatry. 2023 Jun 19;84(4):22m14568. doi: 10.4088/JCP.22m14568.
8
Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review.
J Am Med Inform Assoc. 2019 Apr 1;26(4):364-379. doi: 10.1093/jamia/ocy173.
9
Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives.
JCO Clin Cancer Inform. 2024 Aug;8:e2300235. doi: 10.1200/CCI.23.00235.
10
External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients.
Addiction. 2022 Apr;117(4):925-933. doi: 10.1111/add.15730. Epub 2021 Nov 23.

Cited by

1
Enhancing Substance Use Detection in Clinical Notes with Large Language Models.
Res Sq. 2025 May 15:rs.3.rs-6615981. doi: 10.21203/rs.3.rs-6615981/v1.
2
Speak and You Shall Predict: Evidence That Speech at Initial Cocaine Abstinence Is a Biomarker of Long-Term Drug Use Behavior.
Biol Psychiatry. 2025 Jul 1;98(1):65-75. doi: 10.1016/j.biopsych.2025.01.009. Epub 2025 Jan 20.
3
One Third of Alcohol Use Disorder Diagnoses are Missed by ICD Coding.
Subst Use Addctn J. 2025 Apr;46(2):328-336. doi: 10.1177/29767342241288112. Epub 2024 Nov 7.
4
The validity of electronic health data for measuring smoking status: a systematic review and meta-analysis.
BMC Med Inform Decis Mak. 2024 Feb 2;24(1):33. doi: 10.1186/s12911-024-02416-3.
5
Semi-supervised ROC analysis for reliable and streamlined evaluation of phenotyping algorithms.
J Am Med Inform Assoc. 2024 Feb 16;31(3):640-650. doi: 10.1093/jamia/ocad226.
6
Patterns of substance use among adolescents in and out of foster care: An analysis of linked health and child welfare administrative data.
Child Abuse Negl. 2023 Dec;146:106473. doi: 10.1016/j.chiabu.2023.106473. Epub 2023 Oct 4.
7
Disparities in cannabis use and documentation in electronic health records among children and young adults.
NPJ Digit Med. 2023 Aug 8;6(1):138. doi: 10.1038/s41746-023-00885-w.
8
RegEMR: a natural language processing system to automatically identify premature ovarian decline from Chinese electronic medical records.
BMC Med Inform Decis Mak. 2023 Jul 18;23(1):126. doi: 10.1186/s12911-023-02239-8.
9
Developing an LSTM Model to Identify Surgical Site Infections using Electronic Healthcare Records.
AMIA Jt Summits Transl Sci Proc. 2023 Jun 16;2023:330-339. eCollection 2023.
10
Algorithms to Identify Nonmedical Opioid Use.
Curr Pain Headache Rep. 2023 May;27(5):81-88. doi: 10.1007/s11916-023-01104-7. Epub 2023 Apr 6.

References

1
A chart review of substance use screening and related documentation among adolescents in outpatient pediatric clinics: implications for practice.
Subst Abuse Treat Prev Policy. 2020 May 25;15(1):36. doi: 10.1186/s13011-020-00276-4.
2
Finding warning markers: Leveraging natural language processing and machine learning technologies to detect risk of school violence.
Int J Med Inform. 2020 Jul;139:104137. doi: 10.1016/j.ijmedinf.2020.104137. Epub 2020 Apr 25.
3
Adolescent SBIRT implementation: Generalist vs. Specialist models of service delivery in primary care.
J Subst Abuse Treat. 2020 Apr;111:67-72. doi: 10.1016/j.jsat.2020.01.007. Epub 2020 Jan 20.
4
Cohort selection for clinical trials: n2c2 2018 shared task track 1.
J Am Med Inform Assoc. 2019 Nov 1;26(11):1163-1171. doi: 10.1093/jamia/ocz163.
5
Clinical Workflow and Substance Use Screening, Brief Intervention, and Referral to Treatment Data in the Electronic Health Records: A National Drug Abuse Treatment Clinical Trials Network Study.
EGEMS (Wash DC). 2019 Aug 1;7(1):35. doi: 10.5334/egems.293.
6
A Real-Time Automated Patient Screening System for Clinical Trials Eligibility in an Emergency Department: Design and Evaluation.
JMIR Med Inform. 2019 Jul 24;7(3):e14185. doi: 10.2196/14185.
7
Commentary on Winhusen et al. (2019): Substance use disorders, chronic diseases, and electronic health records-a paradigm for screening and intervention.
Addiction. 2019 Aug;114(8):1471-1472. doi: 10.1111/add.14659. Epub 2019 Jun 26.
8
Targeting Youth to Prevent Later Substance Use Disorder: An Underutilized Response to the US Opioid Crisis.
Am J Public Health. 2019 Jun;109(S3):S185-S189. doi: 10.2105/AJPH.2019.305020.
9
Toward a clinical text encoder: pretraining for clinical natural language processing with applications to substance misuse.
J Am Med Inform Assoc. 2019 Nov 1;26(11):1272-1278. doi: 10.1093/jamia/ocz072.
10
Electronic Health Records Are the Next Frontier for the Genetics of Substance Use Disorders.
Trends Genet. 2019 May;35(5):317-318. doi: 10.1016/j.tig.2019.01.007. Epub 2019 Feb 21.