Conditional Probability Joint Extraction of Nested Biomedical Events: Design of a Unified Extraction Framework Based on Neural Networks

Author information

Wang Yan, Wang Jian, Lu Huiyi, Xu Bing, Zhang Yijia, Banbhrani Santosh Kumar, Lin Hongfei

Affiliations

School of Computer Science and Technology, Dalian University of Technology, Dalian, China.

Department of Pharmacy, The Second Affiliated Hospital of Dalian Medical University, Dalian, China.

Publication information

JMIR Med Inform. 2022 Jun 7;10(6):e37804. doi: 10.2196/37804.

DOI:10.2196/37804

PMID:35671070

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9214613/

Abstract

BACKGROUND

Event extraction is essential for natural language processing. In the biomedical field, the nested event phenomenon (event A serving as a participating role of event B) makes nested events more difficult to extract than single events. As a result, extraction performance on nested biomedical events has remained underwhelming. In addition, previous work relied on a pipeline to build the event extraction model, which ignored the dependence between the trigger recognition and event argument detection tasks and produced significant cascading errors.
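To make the nesting concrete, the following is a minimal Python sketch of how a nested event could be represented as a data structure. The class names, the example sentence, and the type and role labels are illustrative assumptions and are not taken from the paper or its corpora.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Entity:
    text: str
    type: str


@dataclass
class Event:
    trigger: str
    type: str
    # Each argument is a (role, filler) pair. The filler may itself be an
    # Event, which is what makes the structure nested.
    arguments: List[Tuple[str, Union[Entity, "Event"]]] = field(default_factory=list)


def depth(event: Event) -> int:
    """Nesting depth: 1 for a flat event, plus 1 per level of event-valued arguments."""
    child_depths = [depth(f) for _, f in event.arguments if isinstance(f, Event)]
    return 1 + max(child_depths, default=0)


# Hypothetical sentence (assumption): "IL-4 inhibits expression of STAT6."
# Event A: a flat event whose only argument is an entity.
expression = Event(
    "expression", "Gene_expression", [("Theme", Entity("STAT6", "Protein"))]
)
# Event B: takes event A as its Theme, so B is a nested event.
inhibits = Event(
    "inhibits",
    "Negative_regulation",
    [("Cause", Entity("IL-4", "Protein")), ("Theme", expression)],
)
```

In this sketch, an error in recognizing the inner trigger ("expression") would also break the outer event, which is one way to see why nested events are harder to extract than flat ones.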

OBJECTIVE

This study aims to design a unified framework that jointly trains biomedical event trigger recognition and argument detection, and thereby improves the performance of extracting nested biomedical events.

METHODS

We proposed an end-to-end joint extraction model that considers the probability distribution of triggers to alleviate cascading errors. Moreover, we integrated the syntactic structure into an attention-based gate graph convolutional network to capture potential interrelations between triggers and related entities, which improved the performance of extracting nested biomedical events.
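The idea of conditioning on the trigger probability distribution, rather than on a single hard trigger decision, can be illustrated with a toy calculation. The sketch below is not the authors' neural model; the probability tables, labels, and function names are made-up assumptions used only to contrast a hard pipeline decision with a soft, distribution-aware one.

```python
from typing import Dict

# Hypothetical P(trigger type) for one candidate trigger token (made-up values).
trigger_probs: Dict[str, float] = {
    "Regulation": 0.40,
    "Positive_regulation": 0.35,
    "Binding": 0.25,
}

# Hypothetical P(role | trigger type) for one candidate argument (made-up values).
role_given_trigger: Dict[str, Dict[str, float]] = {
    "Regulation": {"Theme": 0.45, "Cause": 0.55},
    "Positive_regulation": {"Theme": 0.90, "Cause": 0.10},
    "Binding": {"Theme": 0.80, "Cause": 0.20},
}


def pipeline_role(t_probs: Dict[str, float],
                  r_given_t: Dict[str, Dict[str, float]]) -> str:
    """Pipeline style: commit to the single best trigger type, then pick a role."""
    best_trigger = max(t_probs, key=t_probs.get)
    roles = r_given_t[best_trigger]
    return max(roles, key=roles.get)


def marginal_role_probs(t_probs: Dict[str, float],
                        r_given_t: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Soft style: P(role) = sum over trigger types of P(trigger) * P(role | trigger)."""
    marginal: Dict[str, float] = {}
    for trig, p_t in t_probs.items():
        for role, p_r in r_given_t[trig].items():
            marginal[role] = marginal.get(role, 0.0) + p_t * p_r
    return marginal


def joint_role(t_probs: Dict[str, float],
               r_given_t: Dict[str, Dict[str, float]]) -> str:
    """Pick the role with the highest probability under the full trigger distribution."""
    marginal = marginal_role_probs(t_probs, r_given_t)
    return max(marginal, key=marginal.get)
```

With these made-up numbers, the pipeline commits to "Regulation" and selects "Cause", whereas weighting by the whole trigger distribution selects "Theme". This is the kind of cascading error that a hard trigger decision can introduce when the top trigger score is only marginally ahead.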

RESULTS

The experimental results demonstrated that our proposed method achieved the best F1 score on the Multi-Level Event Extraction (MLEE) biomedical event extraction corpus and favorable performance on the BioNLP Shared Task 2011 Genia event corpus.

CONCLUSIONS

Our conditional probability joint extraction model performs well on nested biomedical events because of its joint extraction mechanism and syntax graph structure. Moreover, because the model does not rely on external knowledge or task-specific feature engineering, it shows a degree of generalization ability.

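Summary

Background

Event extraction is essential to natural language processing. In the biomedical field, the nested event phenomenon (event A acting as a participating role of event B) makes such events harder to extract than single events. As a result, extraction performance on nested biomedical events has remained unsatisfactory. In addition, previous work built event extraction models as pipelines, an approach that ignores the dependence between the trigger recognition and event argument detection tasks and produces significant cascading errors.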