
Incorporating multi-level CNN and attention mechanism for Chinese clinical named entity recognition.

Affiliations

Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China.

Publication information

J Biomed Inform. 2021 Apr;116:103737. doi: 10.1016/j.jbi.2021.103737. Epub 2021 Mar 15.

Abstract

Named entity recognition (NER) is a fundamental task in Chinese natural language processing (NLP). Chinese clinical NER has recently attracted sustained research attention because it is an essential preparation step for clinical data mining. The prevailing deep learning approach to Chinese clinical NER is based on the long short-term memory (LSTM) network. However, the recurrent structure of LSTM makes it difficult to exploit GPU parallelism, which lowers model efficiency, and when sentences are long, LSTM can hardly capture global context information. To address these issues, we propose a novel and efficient model built entirely on convolutional neural networks (CNNs), which can fully exploit GPU parallelism to improve model efficiency. Moreover, we construct a multi-level CNN to capture both short-term and long-term context information. We also design a simple attention mechanism to obtain global context information, which is conducive to improving model performance on sequence labeling tasks. In addition, a data augmentation method is proposed to expand the data volume and explore more semantic information. Extensive experiments show that our model achieves competitive performance with higher efficiency compared with other notable clinical NER models.
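As a rough illustration of the architecture sketched in the abstract, the snippet below stacks dilated 1-D convolutions (one plausible reading of "multi-level CNN" that mixes short- and long-range context) and adds a simple dot-product self-attention layer for global context before a per-character tag classifier. This is a minimal sketch under stated assumptions: the layer sizes, dilation schedule, attention formulation, and the `MultiLevelCNNTagger` name are illustrative, and the paper's exact configuration (including whether decoding uses a CRF or plain softmax) is not specified in the abstract.

```python
# Minimal sketch of a CNN + attention sequence tagger for clinical NER (PyTorch).
# Layer sizes, the dilated-convolution reading of "multi-level CNN", and the
# dot-product attention are illustrative assumptions, not the authors' exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLevelCNNTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=128, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Stacked 1-D convolutions with growing dilation: lower layers capture
        # short-range context, upper layers capture long-range context.
        self.convs = nn.ModuleList([
            nn.Conv1d(emb_dim if i == 0 else hidden_dim, hidden_dim,
                      kernel_size=3, dilation=d, padding=d)
            for i, d in enumerate([1, 2, 4])
        ])
        # Simple scaled dot-product self-attention for global context.
        self.attn_proj = nn.Linear(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, char_ids):                      # (batch, seq_len)
        x = self.embed(char_ids).transpose(1, 2)      # (batch, emb_dim, seq_len)
        for conv in self.convs:
            x = F.relu(conv(x))                       # sequence length preserved
        h = x.transpose(1, 2)                         # (batch, seq_len, hidden_dim)
        q = self.attn_proj(h)
        scores = torch.matmul(q, h.transpose(1, 2)) / h.size(-1) ** 0.5
        ctx = torch.matmul(F.softmax(scores, dim=-1), h)      # global context
        return self.classifier(torch.cat([h, ctx], dim=-1))   # per-token tag logits

# Usage: logits over BIO tags for a batch of character id sequences.
model = MultiLevelCNNTagger(vocab_size=3000, num_tags=9)
logits = model(torch.randint(1, 3000, (2, 60)))       # shape (2, 60, 9)
```

Because the convolutions and attention are position-wise and have no recurrence, every token in a batch can be processed in parallel on a GPU, which is the efficiency argument the abstract makes against LSTM-based taggers.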

