

Context-Aware Attentive Multilevel Feature Fusion for Named Entity Recognition.

Author Information

Yang Zhiwei, Ma Jing, Chen Hechang, Zhang Jiawei, Chang Yi

Publication Information

IEEE Trans Neural Netw Learn Syst. 2022 Jun 8;PP. doi: 10.1109/TNNLS.2022.3178522.

Abstract

In the era of information explosion, named entity recognition (NER) has attracted widespread attention in the field of natural language processing, as it is fundamental to information extraction. Recently, methods of NER based on representation learning, e.g., character embedding and word embedding, have demonstrated promising recognition results. However, existing models only consider partial features derived from words or characters while failing to integrate semantic and syntactic information, e.g., capitalization, inter-word relations, keywords, and lexical phrases, from multilevel perspectives. Intuitively, multilevel features can be helpful when recognizing named entities from complex sentences. In this study, we propose a novel attentive multilevel feature fusion (AMFF) model for NER, which captures the multilevel features in the current context from various perspectives. It consists of four components to, respectively, capture the local character-level (CL), global character-level (CG), local word-level (WL), and global word-level (WG) features in the current context. In addition, we further define document-level features crafted from other sentences to enhance the representation learning of the current context. To this end, we introduce a novel context-aware attentive multilevel feature fusion (CAMFF) model based on AMFF, to fully leverage document-level features from all the previous inputs. The obtained multilevel features are then fused and fed into a bidirectional long short-term memory (BiLSTM)-conditional random field (CRF) network for the final sequence labeling. Extensive experiments on four benchmark datasets demonstrate that our proposed AMFF and CAMFF models outperform a set of state-of-the-art baseline methods and the features learned from multiple levels are complementary.
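The abstract does not give the fusion equations, so as a rough, hypothetical illustration only (not the authors' actual AMFF formulation), the core idea of attention-weighted fusion of the four feature levels (CL, CG, WL, WG) can be sketched as follows: each level contributes one feature vector per token, a relevance score per level is turned into a softmax weight, and the fused representation is the weighted sum. All names and the toy scores here are invented for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of raw scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attentive_fusion(features, scores):
    """Attention-weighted fusion of multilevel feature vectors.

    features: list of equal-length vectors, e.g. [CL, CG, WL, WG]
              for one token (hypothetical stand-ins for the paper's
              character- and word-level features).
    scores:   one raw relevance score per feature level (in the paper
              these would be learned; here they are fixed toy values).
    Returns the fused vector and the attention weights.
    """
    weights = softmax(scores)
    dim = len(features[0])
    fused = [0.0] * dim
    for w, vec in zip(weights, features):
        for i, v in enumerate(vec):
            fused[i] += w * v
    return fused, weights

# Toy example: four 3-d feature vectors for a single token.
cl = [1.0, 0.0, 0.0]   # local character-level
cg = [0.0, 1.0, 0.0]   # global character-level
wl = [0.0, 0.0, 1.0]   # local word-level
wg = [0.5, 0.5, 0.0]   # global word-level
fused, weights = attentive_fusion([cl, cg, wl, wg], [2.0, 1.0, 1.0, 0.5])
```

In the full model the fused per-token vectors would then be fed to a BiLSTM-CRF layer for sequence labeling; that downstream stage is omitted here.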

