College of Information and Computer Science, University of Massachusetts Amherst, Amherst, MA, United States.
Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States.
AMIA Annu Symp Proc. 2021 Jan 25;2020:860-869. eCollection 2020.
A bleeding event is a common adverse drug reaction among patients on anticoagulation, and it factors critically into a clinician's decision to prescribe or continue anticoagulation for atrial fibrillation. However, bleeding events are not uniformly captured in the administrative data of electronic health records (EHRs). Because manual review is prohibitively expensive, we investigate the effectiveness of various natural language processing (NLP) methods for automatically extracting bleeding events. Using 1,079 de-identified EHR notes annotated by experts, we evaluated state-of-the-art NLP models, including a biLSTM-CRF with language modeling and several BERT variants, on six entity types. On our dataset, the biLSTM-CRF outperformed the other models, reaching a macro F1-score of 0.75, whereas the performance differences were negligible for sentence-level and document-level predictions, where the best macro F1-scores were 0.84 and 0.96, respectively. Our error analyses suggest that the models' incorrect predictions can be attributed to variability in entity spans, memorization, and missing negation signals.
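For readers unfamiliar with the metric, the macro F1-score reported above is the unweighted mean of the per-type F1-scores, so rare entity types count as much as frequent ones. The sketch below illustrates that standard definition only. The function names, the type labels `type_a` and `type_b`, and the counts are hypothetical; the abstract does not name the six entity types or state the span-matching criterion used for scoring.

```python
from typing import Dict, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for one entity type from true-positive, false-positive, false-negative counts."""
    if tp == 0:
        # No correct predictions: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-type F1; every type contributes equally."""
    if not counts:
        raise ValueError("at least one entity type is required")
    return sum(f1_from_counts(*c) for c in counts.values()) / len(counts)


# Hypothetical counts (tp, fp, fn) for two placeholder entity types.
example_counts = {
    "type_a": (8, 2, 2),  # precision 0.8, recall 0.8
    "type_b": (1, 1, 3),  # precision 0.5, recall 0.25
}

if __name__ == "__main__":
    print(f"macro F1 = {macro_f1(example_counts):.4f}")
```

Because the mean is unweighted, a weak score on a sparsely annotated type pulls the macro F1 down more than it would a micro-averaged score, which is one reason the macro average is often preferred for imbalanced entity types.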