Health Fidelity, San Mateo, CA, USA.
VA Salt Lake City Health Care System, University of Utah, Salt Lake City, UT, USA.
Drug Saf. 2019 Jan;42(1):147-156. doi: 10.1007/s40264-018-0763-y.
Identifying occurrences of medication side effects and adverse drug events (ADEs) is an important and challenging task because they are often mentioned only in clinical narratives and are not formally reported.
We developed a natural language processing (NLP) system that aims to identify mentions of symptoms and drugs in clinical notes and label the relationship between the mentions as indications or ADEs. The system leverages an existing word embeddings model with induced word clusters for dimensionality reduction. It employs a conditional random field (CRF) model for named entity recognition (NER) and a random forest model for relation extraction (RE).
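As a rough illustration of how induced word clusters can reduce dimensionality, the sketch below groups a handful of toy word vectors and replaces each dense vector with a single cluster ID that a downstream model could use as a categorical feature. Everything here is an assumption made for illustration: the abstract does not name the clustering algorithm, so plain k-means stands in for it, and the `EMBEDDINGS` values, the `kmeans` helper, and the `token_features` function are hypothetical rather than part of the published system.

```python
import math

# Toy 2-dimensional "embeddings" (hypothetical values, not from the study).
EMBEDDINGS = {
    "nausea": (0.90, 0.10),
    "warfarin": (0.10, 0.90),
    "vomiting": (0.80, 0.20),
    "aspirin": (0.20, 0.80),
    "rash": (0.85, 0.15),
}


def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        for i, vec in enumerate(vectors):
            assignments[i] = min(
                range(k), key=lambda c: math.dist(vec, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments


words = list(EMBEDDINGS)
cluster_ids = kmeans([EMBEDDINGS[w] for w in words], k=2)
word_to_cluster = dict(zip(words, cluster_ids))


def token_features(token):
    """Replace a dense vector with one categorical feature: the cluster ID."""
    cluster = word_to_cluster.get(token.lower())
    return {"cluster": "unk" if cluster is None else str(cluster)}


print(token_features("Vomiting"))
print(token_features("aspirin"))
```

In this toy setup the symptom-like words share one cluster and the drug-like words share the other, so a sequence labeler sees one discrete feature per token instead of a full embedding vector.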
We evaluated the final performance of each model separately and then in combination on a manually annotated evaluation set. The micro-averaged F1 score was 80.9% for NER, 88.1% for RE, and 61.2% for the integrated system. Outputs from our systems were submitted to the NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE 1.0) competition (Yu et al. in http://bio-nlp.org/index.php/projects/39-nlp-challenges, 2018). System performance was evaluated in three tasks (NER, RE, and complete system), with multiple teams submitting output from their systems for each task. Our RE system placed first in Task 2 of the challenge, and our integrated system placed third in Task 3.
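For readers unfamiliar with the metric, the following minimal sketch shows how a micro-averaged F1 score pools true positives, false positives, and false negatives across all labels before computing precision and recall. The `micro_f1` function, the label names, and the character-offset spans are hypothetical examples; this is not the official MADE 1.0 scorer, and exact-match comparison of items is an assumption.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over dicts mapping label -> set of annotated items.

    Counts are pooled across labels, so frequent labels weigh more
    than rare ones. Items match only if they are exactly equal.
    """
    tp = fp = fn = 0
    for label in set(gold) | set(predicted):
        gold_items = gold.get(label, set())
        pred_items = predicted.get(label, set())
        tp += len(gold_items & pred_items)   # correctly predicted items
        fp += len(pred_items - gold_items)   # predicted but not in gold
        fn += len(gold_items - pred_items)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical entity spans as (start, end) character offsets.
gold = {"Drug": {(0, 7), (20, 29)}, "ADE": {(40, 44)}}
predicted = {"Drug": {(0, 7)}, "ADE": {(40, 44), (50, 58)}}

score = micro_f1(gold, predicted)
print(f"micro-averaged F1 = {score:.3f}")
```

Here two of three predictions are correct and two of three gold items are found, so precision and recall are both 2/3 and the pooled F1 is also 2/3.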
Adding to the growing number of publications that use NLP to detect occurrences of ADEs, our study illustrates the benefits of innovative feature engineering, such as word clusters induced from an existing word embeddings model.