Automated Detection of Substance-Use Status and Related Information from Clinical Text.

Affiliations

Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia.

Department of Computer Science, Durham University, Upper Mountjoy Campus, Stockton Road, Durham DH1 3LE, UK.

Publication information

Sensors (Basel). 2022 Dec 8;22(24):9609. doi: 10.3390/s22249609.

DOI:10.3390/s22249609

PMID:36559979

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9783118/

Abstract

This study aims to develop and evaluate an automated system for extracting information related to patient substance use (smoking, alcohol, and drugs) from unstructured clinical text (medical discharge records). The authors propose a four-stage system for the extraction of the substance-use status and related attributes (type, frequency, amount, quit-time, and period). The first stage uses a keyword search technique to detect sentences related to substance use and to exclude unrelated records. In the second stage, an extension of the NegEx negation detection algorithm is developed and employed for detecting the negated records. The third stage involves identifying the temporal status of the substance use by applying windowing and chunking methodologies. Finally, in the fourth stage, regular expressions, syntactic patterns, and keyword search techniques are used in order to extract the substance-use attributes. The proposed system achieves an F1-score of up to 0.99 for identifying substance-use-related records, 0.98 for detecting the negation status, and 0.94 for identifying temporal status. Moreover, F1-scores of up to 0.98, 0.98, 1.00, 0.92, and 0.98 are achieved for the extraction of the amount, frequency, type, quit-time, and period attributes, respectively. Natural Language Processing (NLP) and rule-based techniques are employed efficiently for extracting substance-use status and attributes, with the proposed system being able to detect substance-use status and attributes over both sentence-level and document-level data. Results show that the proposed system outperforms the compared state-of-the-art substance-use identification system on an unseen dataset, demonstrating its generalisability.
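The four stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword stems, negation triggers, past-tense cues, window size, and regular expressions below are all hypothetical stand-ins chosen for illustration, the negation check is a simplified NegEx-style window rather than the extended NegEx the paper develops, and the type attribute is omitted.

```python
import re

# Hypothetical lexicons; the paper's actual keyword lists are not given in the abstract.
SUBSTANCE_STEMS = {
    "smoking": ("smok", "tobacco", "cigarette"),
    "alcohol": ("alcohol", "etoh", "drink"),
    "drugs": ("cocaine", "heroin", "marijuana", "ivdu"),
}
NEGATION_TRIGGERS = {"no", "not", "denies", "denied", "never", "without"}
PAST_CUES = {"quit", "former", "stopped", "ex", "previously", "ago"}

# Hypothetical attribute patterns (stage 4); real clinical text needs far richer rules.
ATTRIBUTE_PATTERNS = {
    "amount": re.compile(r"\b\d+(?:\.\d+)?\s*(?:packs?|cigarettes?|drinks?|beers?)\b"),
    "frequency": re.compile(
        r"\b(?:per|a|each|every)\s+(?:day|week|month|year)\b|\b(?:daily|weekly|monthly)\b"
    ),
    "period": re.compile(r"\bfor\s+\d+\s+(?:years?|months?)\b"),
    "quit_time": re.compile(r"\b\d+\s+(?:years?|months?|weeks?)\s+ago\b"),
}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def is_negated(toks, positions, window=5):
    """Stage 2: a negation trigger within `window` tokens before a keyword."""
    return any(
        t in NEGATION_TRIGGERS
        for i in positions
        for t in toks[max(0, i - window):i]
    )


def temporal_status(toks, positions, window=5):
    """Stage 3: 'past' if a past cue falls in a window around a keyword."""
    for i in positions:
        nearby = toks[max(0, i - window): i + window + 1]
        if any(t in PAST_CUES for t in nearby):
            return "past"
    return "current"


def extract_attributes(sentence):
    """Stage 4: first regex match for each attribute, or None."""
    text = sentence.lower()
    found = {}
    for name, pattern in ATTRIBUTE_PATTERNS.items():
        match = pattern.search(text)
        found[name] = match.group(0) if match else None
    return found


def analyze(sentence):
    """Run all four stages on one sentence; one record per substance found."""
    toks = tokenize(sentence)
    records = []
    for substance, stems in SUBSTANCE_STEMS.items():
        # Stage 1: keyword search; skip substances with no matching token.
        positions = [i for i, t in enumerate(toks) if t.startswith(stems)]
        if not positions:
            continue
        if is_negated(toks, positions):
            records.append({"substance": substance, "status": "none", "attributes": {}})
            continue
        records.append({
            "substance": substance,
            "status": temporal_status(toks, positions),
            "attributes": extract_attributes(sentence),
        })
    return records


if __name__ == "__main__":
    for s in ["Patient smokes 2 packs per day for 20 years.",
              "He denies alcohol use.",
              "Former smoker, quit 5 years ago."]:
        print(s, "->", analyze(s))
```

Running the stages in this order means negated sentences are filtered out before any attribute extraction is attempted, which mirrors the staged design the abstract describes.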

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acd1/9783118/0cc7b783f99f/sensors-22-09609-g001.jpg

Similar articles

Negation recognition in clinical natural language processing using a combination of the NegEx algorithm and a convolutional neural network.

BMC Med Inform Decis Mak. 2023 Oct 13;23(1):216. doi: 10.1186/s12911-023-02301-5.

DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx.

J Biomed Inform. 2015 Apr;54:213-9. doi: 10.1016/j.jbi.2015.02.010. Epub 2015 Mar 16.

Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records.

Ups J Med Sci. 2020 Nov;125(4):316-324. doi: 10.1080/03009734.2020.1792010. Epub 2020 Jul 22.

Extracting social determinants of health from electronic health records using natural language processing: a systematic review.

J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170.

Automated detection of substance use information from electronic health records for a pediatric population.

J Am Med Inform Assoc. 2021 Sep 18;28(10):2116-2127. doi: 10.1093/jamia/ocab116.

Extracting social determinants of health events with transformer-based multitask, multilabel named entity recognition.

J Am Med Inform Assoc. 2023 Jul 19;30(8):1379-1388. doi: 10.1093/jamia/ocad046.

[A customized method for information extraction from unstructured text data in the electronic medical records].

Beijing Da Xue Xue Bao Yi Xue Ban. 2018 Apr 18;50(2):256-263.

Extracting Alcohol and Substance Abuse Status from Clinical Notes: The Added Value of Nursing Data.

Stud Health Technol Inform. 2019 Aug 21;264:1056-1060. doi: 10.3233/SHTI190386.

Extracting information from the text of electronic medical records to improve case detection: a systematic review.

J Am Med Inform Assoc. 2016 Sep;23(5):1007-15. doi: 10.1093/jamia/ocv180. Epub 2016 Feb 5.
