Named entity recognition for Chinese based on global pointer and adversarial training.

Affiliations

Key Laboratory of Deep-time Geography and Environment Reconstruction and Applications, MNR & College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China.

College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China.

Publication information

Sci Rep. 2023 Feb 24;13(1):3242. doi: 10.1038/s41598-023-30355-y.

DOI:10.1038/s41598-023-30355-y

PMID:36828907

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9958032/

Abstract

Named entity recognition (NER) aims to identify entities in unstructured text and is an important subtask of natural language processing and knowledge graph construction. Most existing methods either use conditional random fields (CRFs) as label decoders or use pointer networks. However, when the number of tags is large, CRF-based methods are computationally expensive, and they cannot handle nested entities. A pointer network uses two separate modules to identify the start and the end of an entity; each module attends only to start or end information and cannot capture global information about the entity. In addition, neural network models suffer from local instability. To address these problems, a named entity recognition model based on global pointer and adversarial training is proposed. To obtain global entity information, a global pointer is used to decode entities, and rotary relative position information is incorporated into the model design to improve its awareness of position; to address local instability, adversarial training is used to improve the robustness and generalization of the model. Experimental results show that the model's F1 score improves over existing mainstream models on several public datasets: OntoNotes5, MSRA, Resume, and Weibo.
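The abstract does not give implementation details, so the following is only a minimal NumPy sketch of how span scoring with a global pointer and rotary position information is commonly set up, not the authors' code. It assumes one query/key projection per entity type, a score for every (start, end) token pair, and a decision threshold of 0. The function names (`rope`, `global_pointer_scores`, `decode_spans`), the shapes, and the random untrained weights are all hypothetical and chosen for illustration.

```python
import numpy as np


def rope(x: np.ndarray) -> np.ndarray:
    """Apply a rotary position embedding to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    pos = np.arange(seq_len)[:, None]                    # (seq_len, 1)
    inv_freq = 10000.0 ** (-np.arange(0, dim, 2) / dim)  # (dim / 2,)
    angles = pos * inv_freq[None, :]                     # (seq_len, dim / 2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Rotate each (even, odd) feature pair by a position-dependent angle.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out


def global_pointer_scores(h: np.ndarray, w_q: np.ndarray, w_k: np.ndarray) -> np.ndarray:
    """Score every (start, end) span for every entity type.

    h:   (seq_len, hidden) token representations from some encoder (assumed).
    w_q: (num_types, hidden, head_dim) query projections (hypothetical weights).
    w_k: (num_types, hidden, head_dim) key projections (hypothetical weights).
    Returns an array of shape (num_types, seq_len, seq_len).
    """
    seq_len = h.shape[0]
    per_type = []
    for wq, wk in zip(w_q, w_k):
        q = rope(h @ wq)  # start-token view, with position rotated in
        k = rope(h @ wk)  # end-token view, with position rotated in
        per_type.append(q @ k.T / np.sqrt(q.shape[-1]))
    scores = np.stack(per_type)
    # A span whose end precedes its start is invalid: mask the lower triangle.
    invalid = np.tril(np.ones((seq_len, seq_len), dtype=bool), k=-1)
    scores[:, invalid] = -np.inf
    return scores


def decode_spans(scores: np.ndarray, threshold: float = 0.0):
    """Return (type_id, start, end) for every span scoring above the threshold."""
    return [(int(t), int(i), int(j)) for t, i, j in zip(*np.where(scores > threshold))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.normal(size=(6, 16))       # 6 tokens, hidden size 16
    w_q = rng.normal(size=(3, 16, 8))  # 3 entity types, head size 8
    w_k = rng.normal(size=(3, 16, 8))
    print(decode_spans(global_pointer_scores(h, w_q, w_k)))
```

Because every (start, end) pair is scored jointly, overlapping and nested spans can both exceed the threshold, which is how this style of decoder sidesteps the nested-entity limitation the abstract attributes to CRF decoders. The rotation makes each score depend on the distance between the start and end positions rather than on their absolute indices.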

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2df9/9958032/a82158831a76/41598_2023_30355_Fig1_HTML.jpg

Similar articles

Precursor-induced conditional random fields: connecting separate entities by induction for improved clinical named entity recognition.

BMC Med Inform Decis Mak. 2019 Jul 15;19(1):132. doi: 10.1186/s12911-019-0865-1.

Joint extraction of Chinese medical entities and relations based on RoBERTa and single-module global pointer.

BMC Med Inform Decis Mak. 2024 Jul 31;24(1):218. doi: 10.1186/s12911-024-02577-1.

MMBERT: a unified framework for biomedical named entity recognition.

Med Biol Eng Comput. 2024 Jan;62(1):327-341. doi: 10.1007/s11517-023-02934-8. Epub 2023 Oct 14.

Chinese clinical named entity recognition with radical-level feature and self-attention mechanism.

J Biomed Inform. 2019 Oct;98:103289. doi: 10.1016/j.jbi.2019.103289. Epub 2019 Sep 18.

Adversarial training based lattice LSTM for Chinese clinical named entity recognition.

J Biomed Inform. 2019 Nov;99:103290. doi: 10.1016/j.jbi.2019.103290. Epub 2019 Sep 23.

Disease named entity recognition by combining conditional random fields and bidirectional recurrent neural networks.

Database (Oxford). 2016 Oct 24;2016. doi: 10.1093/database/baw140. Print 2016.

Application of cascade binary pointer tagging in joint entity and relation extraction of Chinese medical text.

Math Biosci Eng. 2022 Jul 27;19(10):10656-10672. doi: 10.3934/mbe.2022498.

Named entity recognition of Chinese electronic medical records based on a hybrid neural network and medical MC-BERT.

BMC Med Inform Decis Mak. 2022 Dec 1;22(1):315. doi: 10.1186/s12911-022-02059-2.

Chinese Clinical Named Entity Recognition with ALBERT and MHA Mechanism.

Evid Based Complement Alternat Med. 2022 May 23;2022:2056039. doi: 10.1155/2022/2056039. eCollection 2022.

Cited by

Information extraction from green channel textual records on expressways using hybrid deep learning.

Sci Rep. 2024 Dec 28;14(1):31269. doi: 10.1038/s41598-024-82681-4.

Sequential lexicon enhanced bidirectional encoder representations from transformers: Chinese named entity recognition using sequential lexicon enhanced BERT.

PeerJ Comput Sci. 2024 Oct 18;10:e2344. doi: 10.7717/peerj-cs.2344. eCollection 2024.
