Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations.

Authors

Zhang Min, Geng Guohua, Chen Jing

Affiliations

School of Information Science and Technology, Northwest University, Xi'an 710127, China.

School of Engineering and Technology, Xi'an Fanyi University, 710105 Xi'an, China.

Publication information

Entropy (Basel). 2020 Feb 22;22(2):252. doi: 10.3390/e22020252.

DOI:10.3390/e22020252
PMID:33286026
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7516692/
Abstract

Increasingly popular online museums have significantly changed the way people acquire cultural knowledge, and they generate large amounts of cultural relics data. In recent years, researchers have implemented named-entity recognition (NER) with deep learning models, which automatically extract complex features and have rich representation capabilities. However, labeled data are scarce in the field of cultural relics, which makes it difficult for deep learning models that rely on labeled data to perform well. To address this problem, this paper proposes a semi-supervised deep learning model named SCRNER (Semi-supervised model for Cultural Relics' Named Entity Recognition). It uses a bidirectional long short-term memory (BiLSTM) and conditional random fields (CRF) model trained on a small amount of labeled data and abundant unlabeled data. For semi-supervised sample selection, we propose a repeat-labeled (relabeled) strategy that selects high-confidence samples to enlarge the training set iteratively. In addition, we use embeddings from language models (ELMo) representations to obtain word representations dynamically as the model input, which addresses the blurred boundaries of cultural-object entities and the Chinese-language characteristics of texts in the field of cultural relics. Experimental results demonstrate that the proposed model, trained on limited labeled data, performs effectively on named-entity recognition for cultural relics.
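The iterative enlargement of the training set described in the abstract can be illustrated with a generic self-training loop. This is a minimal sketch, not the authors' implementation: the abstract does not spell out the relabeled strategy, so the acceptance rule below (confidence at or above a threshold and identical tags in two consecutive rounds) is an assumption. The names `self_train`, `train_lexicon_tagger`, `threshold`, and `max_rounds` are hypothetical, and a toy lexicon tagger stands in for the BiLSTM-CRF model with ELMo inputs.

```python
from typing import Callable, Dict, List, Tuple

Sentence = List[str]
Tags = List[str]
Example = Tuple[Sentence, Tags]
# A tagger returns predicted tags plus a sequence-level confidence in [0, 1].
Tagger = Callable[[Sentence], Tuple[Tags, float]]


def train_lexicon_tagger(examples: List[Example]) -> Tagger:
    """Toy stand-in for a BiLSTM-CRF: memorizes a word -> tag lexicon."""
    lexicon: Dict[str, str] = {}
    for sentence, tags in examples:
        for word, tag in zip(sentence, tags):
            lexicon[word] = tag

    def tagger(sentence: Sentence) -> Tuple[Tags, float]:
        tags = [lexicon.get(word, "O") for word in sentence]
        known = sum(word in lexicon for word in sentence)
        # Confidence here is the share of known words; a real model would
        # derive it from its own output scores (assumption).
        return tags, known / max(len(sentence), 1)

    return tagger


def self_train(
    train_fn: Callable[[List[Example]], Tagger],
    labeled: List[Example],
    unlabeled: List[Sentence],
    threshold: float = 0.9,
    max_rounds: int = 5,
) -> Tuple[Tagger, List[Example]]:
    """Grow the training set with confidently and consistently tagged samples."""
    train_set = list(labeled)
    pool = list(unlabeled)
    previous: Dict[Tuple[str, ...], Tags] = {}  # tags from the previous round
    tagger = train_fn(train_set)
    for _ in range(max_rounds):
        remaining: List[Sentence] = []
        added = 0
        for sentence in pool:
            tags, confidence = tagger(sentence)
            key = tuple(sentence)
            # Assumed rule: accept only if confident AND tagged identically
            # in the previous round (one reading of "relabeled").
            if confidence >= threshold and previous.get(key) == tags:
                train_set.append((sentence, tags))
                added += 1
            else:
                previous[key] = tags
                remaining.append(sentence)
        pool = remaining
        if added:
            tagger = train_fn(train_set)  # retrain on the enlarged set
    return tagger, train_set


# Hypothetical toy data: one labeled sentence, two unlabeled ones.
LABELED = [(["bronze", "ding", "from", "Shang"], ["B-REL", "I-REL", "O", "B-DYN"])]
UNLABELED = [["bronze", "ding"], ["jade", "cup"]]

if __name__ == "__main__":
    _, grown = self_train(train_lexicon_tagger, LABELED, UNLABELED)
    for sentence, tags in grown:
        print(sentence, tags)
```

In this toy run only the sentence whose words are all known is promoted into the training set; the low-confidence sentence stays in the unlabeled pool. Requiring agreement across rounds trades slower growth of the training set for a lower risk of reinforcing early labeling mistakes.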




