Accurate disaster entity recognition based on contextual embeddings in self-attentive BiLSTM-CRF.

Authors

Hafsa Noor E, Alzoubi Hadeel Mohammed, Almutlq Atikah Saeed

Affiliation

Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al Ahsa, Saudi Arabia.

Publication

PLoS One. 2025 Mar 26;20(3):e0318262. doi: 10.1371/journal.pone.0318262. eCollection 2025.

DOI:10.1371/journal.pone.0318262
PMID:40138352
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11940654/
Abstract

Automated extraction of disaster-related named entities is crucial for gathering pertinent information during natural or human-made crises. Timely and reliable data is vital for effective disaster management, benefiting humanitarian response authorities, law enforcement agencies, and other concerned organizations. Online news media plays a pivotal role in disseminating crisis-related information during emergencies and facilitating post-hazard disaster response operations. To extract relevant named entities, contextual embedding features prove instrumental. In this study, we investigate the automatic extraction of disaster-related named entities from an annotated dataset of 1000 online news articles. These articles are carefully annotated with 14 crisis-specific entities obtained from relevant ontologies. To generate contextual vector representations of words, we construct a novel word embedding model inspired by Word2vec. These contextual word embedding features, combined with lexicon features, are encoded using a novel contextualized deep Bi-directional LSTM network augmented with self-attention and conditional random field (CRF) layers. We compare the performance of our proposed model with existing word embedding approaches. Through extensive evaluation on an independent test set of 200 articles that includes more than 80,000 tokens, our context-sensitive optimized NER model achieves impressive results at the sentence level. With a Precision of 92%, Recall of 91%, Accuracy of 87%, and an F1-score of 92%, our model outperforms those utilizing general and non-contextual word embeddings, including fine-tuned and contextual BERT models, showcasing its superior performance.
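The abstract names a CRF layer on top of the BiLSTM and self-attention encoder but does not describe how tags are decoded. As a minimal sketch of what a linear-chain CRF layer typically does at inference time, the Python below runs Viterbi decoding over per-token scores. It is not the authors' implementation: the tag names (`B-DIS`, `I-DIS`), the emission scores, and the transition penalty are all hypothetical values chosen for illustration, and the emissions stand in for whatever the encoder would produce.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][tag] is the encoder's score for `tag` at token t.
    transitions[(prev, cur)] is the score for moving from prev to cur
    (missing pairs default to 0.0).
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical tag set and scores for the tokens "flash", "flood", "hit".
TAGS = ["O", "B-DIS", "I-DIS"]
# Strongly penalise an inside tag that directly follows an outside tag.
TRANSITIONS = {("O", "I-DIS"): -10.0}
EMISSIONS = [
    {"O": 1.0, "B-DIS": 0.8, "I-DIS": 0.1},
    {"O": 0.2, "B-DIS": 0.3, "I-DIS": 1.5},
    {"O": 2.0, "B-DIS": 0.1, "I-DIS": 0.1},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
# ['B-DIS', 'I-DIS', 'O']
```

Picking the top-scoring tag for each token independently would give `O, I-DIS, O` here, which is not a valid BIO sequence. Scoring whole sequences lets the transition penalty steer the first token to `B-DIS`, which is the usual motivation for placing a CRF on top of a recurrent encoder.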


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd7/11940654/489d1b5b4fd0/pone.0318262.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd7/11940654/d02954c21b07/pone.0318262.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd7/11940654/58ff6c57aa19/pone.0318262.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd7/11940654/e2e7f13fdc9d/pone.0318262.g005.jpg

Similar articles

1
Extracting comprehensive clinical information for breast cancer using deep learning methods.
Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
2
Using Twitter Data to Monitor Natural Disaster Social Dynamics: A Recurrent Neural Network Approach with Word Embeddings and Kernel Density Estimation.
Sensors (Basel). 2019 Apr 11;19(7):1746. doi: 10.3390/s19071746.
3
Analyzing transfer learning impact in biomedical cross-lingual named entity recognition and normalization.
BMC Bioinformatics. 2021 Dec 17;22(Suppl 1):601. doi: 10.1186/s12859-021-04247-9.
4
Combining Contextualized Embeddings and Prior Knowledge for Clinical Named Entity Recognition: Evaluation Study.
JMIR Med Inform. 2019 Nov 13;7(4):e14850. doi: 10.2196/14850.
5
Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records.
BMC Med Inform Decis Mak. 2022 Mar 23;22(1):72. doi: 10.1186/s12911-022-01810-z.
6
Biomedical named entity recognition using deep neural networks with contextual information.
BMC Bioinformatics. 2019 Dec 27;20(1):735. doi: 10.1186/s12859-019-3321-4.
7
The quest for better clinical word vectors: Ontology based and lexical vector augmentation versus clinical contextual embeddings.
Comput Biol Med. 2021 Jul;134:104433. doi: 10.1016/j.compbiomed.2021.104433. Epub 2021 Apr 28.
8
DeIDNER Model: A Neural Network Named Entity Recognition Model for Use in the De-identification of Clinical Notes.
Biomed Eng Syst Technol Int Jt Conf BIOSTEC Revis Sel Pap. 2022 Feb;5:640-647. doi: 10.5220/0010884500003123.
9
An imConvNet-based deep learning model for Chinese medical named entity recognition.
BMC Med Inform Decis Mak. 2022 Nov 21;22(1):303. doi: 10.1186/s12911-022-02049-4.
