College of Information and Computer Sciences, University of Massachusetts, Amherst, MA, USA.
Department of Quantitative Health Sciences and Radiology, University of Massachusetts Medical School, Worcester, MA, USA.
Drug Saf. 2019 Jan;42(1):99-111. doi: 10.1007/s40264-018-0762-z.
This work describes the Medication and Adverse Drug Events from Electronic Health Records (MADE 1.0) corpus and provides an overview of the MADE 1.0 2018 challenge for extracting medication, indication, and adverse drug events (ADEs) from electronic health record (EHR) notes.
The goal of MADE is to provide a set of common evaluation tasks to assess the state of the art for natural language processing (NLP) systems applied to EHRs supporting drug safety surveillance and pharmacovigilance. We also provide benchmarks on the MADE dataset using the system submissions received in the MADE 2018 challenge.
The MADE 1.0 challenge released an expert-annotated cohort of medication and ADE information comprising 1089 fully de-identified longitudinal EHR notes from 21 randomly selected patients with cancer at the University of Massachusetts Memorial Hospital. Using this cohort as a benchmark, the MADE 1.0 challenge designed three shared NLP tasks. The named entity recognition (NER) task identifies medications and their attributes (dosage, route, duration, and frequency), indications, ADEs, and severity. The relation identification (RI) task identifies relations between the named entities: medication-indication, medication-ADE, and attribute relations. The third shared task (NER-RI) evaluates NLP models that perform the NER and RI tasks jointly. Eleven teams from four countries participated in at least one of the three shared tasks, and 41 system submissions were received in total.
The best systems' F scores for NER, RI, and NER-RI were 0.82, 0.86, and 0.61, respectively. Ensemble classifiers built from the team submissions improved performance further, with F scores of 0.85, 0.87, and 0.66 for the three tasks, respectively.
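To make the two quantities above concrete, the following is a minimal sketch of how an F score over predicted entities and a simple voting ensemble over several submissions might be computed. It is an illustration under stated assumptions, not the challenge's evaluation script: the abstract does not specify the matching criterion or how the ensembles were built, so exact-span matching and majority voting are assumed here, and the names `Entity`, `f_score`, and `majority_vote` are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Entity(NamedTuple):
    """One annotated span (hypothetical representation, not the MADE file format)."""
    note_id: str
    start: int   # character offset where the span begins
    end: int     # character offset where the span ends
    label: str   # e.g. "Drug", "ADE", "Indication", "Severity"


def f_score(gold: Set[Entity], pred: Set[Entity]) -> float:
    """Harmonic mean of precision and recall.

    Assumption: a prediction counts as correct only if note, offsets,
    and label all match a gold annotation exactly.
    """
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def majority_vote(submissions: Iterable[Set[Entity]], min_votes: int) -> Set[Entity]:
    """Keep every entity predicted by at least `min_votes` submissions.

    Assumption: plain voting stands in for the ensemble classifiers,
    whose actual design is not described in the abstract.
    """
    votes = Counter(entity for submission in submissions for entity in submission)
    return {entity for entity, count in votes.items() if count >= min_votes}
```

In this sketch, an entity proposed by only one system is discarded by the vote, which tends to raise precision while keeping recall for entities that several systems agree on. That is one plausible reason an ensemble can outscore its best member.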
MADE results show that recent progress in NLP has led to remarkable improvements in NER and RI tasks for the clinical domain. However, some room for improvement remains, particularly in the NER-RI task.